Abstract

We report on some statistical regularity properties of greatest common divisors: for large random samples of integers, the number of coprime pairs and the average of the gcd's of those pairs are approximately normal, while the maximum of those gcd's (appropriately normalized) follows approximately a Fréchet distribution. We also consider $r$-tuples instead of pairs, and moments other than the average.



1 Introduction 

The purpose of this paper is to report on some statistical regularities of the greatest common divisors of random pairs, or, more generally, $r$-tuples, drawn from large samples of integers.

For any given integer $n \ge 1$, let us denote by $X_1^{(n)}, X_2^{(n)}, \ldots$ a sequence of independent random variables uniformly distributed in $\{1, \ldots, n\}$ and defined on a certain given probability space endowed with a probability $\mathbf{P}$.

The distribution of $\gcd(X_1^{(n)}, X_2^{(n)})$, the gcd of a random pair, is given by

\[ \mathbf{P}\big(\gcd(X_1^{(n)}, X_2^{(n)}) = k\big) = \frac{1}{n^2} \sum_{j=1}^{\lfloor n/k \rfloor} \mu(j) \Big\lfloor \frac{n}{jk} \Big\rfloor^2 , \]

for $1 \le k \le n$. Asymptotically, as $n \to \infty$, one has

\[ \lim_{n \to \infty} \mathbf{P}\big(\gcd(X_1^{(n)}, X_2^{(n)}) = k\big) = \frac{1}{\zeta(2)} \, \frac{1}{k^2} , \]

which, in particular, for $k = 1$, is the classical result of Dirichlet (see, for instance, [17], Theorem 332) that

\[ \lim_{n \to \infty} \mathbf{P}\big(\gcd(X_1^{(n)}, X_2^{(n)}) = 1\big) = \frac{1}{\zeta(2)} = \frac{6}{\pi^2} . \]

For the mean and the variance of $\gcd(X_1^{(n)}, X_2^{(n)})$ one has the asymptotic results

\[ \mathbf{E}\big(\gcd(X_1^{(n)}, X_2^{(n)})\big) \sim \frac{1}{\zeta(2)} \ln(n) \quad\text{and}\quad \mathbf{V}\big(\gcd(X_1^{(n)}, X_2^{(n)})\big) \sim \Big[\frac{1}{3}\Big(\frac{2\zeta(2)}{\zeta(3)} - 1\Big)\Big] \, n , \]

as $n \to \infty$. We refer to E. Cesàro, [5], E. Cohen, [8], P. Diaconis and P. Erdős, [12], and also to [14], for some further details and references. See also Section 3 of this paper.
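The exact formula for the distribution of the gcd of a random pair can be checked by brute force for small $n$. The following is a minimal sketch, not part of the original paper; the helper names `mobius`, `pair_gcd_count_formula` and `pair_gcd_count_brute` are hypothetical, and both functions return the count $n^2 \, \mathbf{P}(\gcd = k)$ so that only integer arithmetic is involved.

```python
from math import gcd


def mobius(j):
    """Moebius function mu(j), by trial division (adequate for small j)."""
    result, p = 1, 2
    while p * p <= j:
        if j % p == 0:
            j //= p
            if j % p == 0:      # p^2 divides the original argument
                return 0
            result = -result
        p += 1
    return -result if j > 1 else result


def pair_gcd_count_formula(n, k):
    """n^2 * P(gcd(X1, X2) = k), computed from the Moebius sum."""
    return sum(mobius(j) * (n // (j * k)) ** 2 for j in range(1, n // k + 1))


def pair_gcd_count_brute(n, k):
    """Same quantity, by enumerating all n^2 pairs."""
    return sum(
        1 for a in range(1, n + 1) for b in range(1, n + 1) if gcd(a, b) == k
    )
```

For instance, `pair_gcd_count_formula(30, 1) / 30**2` gives the exact probability that two independent uniform draws from $\{1, \ldots, 30\}$ are coprime, which can be compared with the limit $6/\pi^2$.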
Fix $n \ge 1$. For each integer $m \ge 2$, consider the random variable

\[ C_m^{(n)} = \sum_{1 \le i < j \le m} \mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)}) = 1} , \]

which counts the number of coprime pairs in a random sample of length $m$ drawn from $\{1, \ldots, n\}$. Observe that $C_m^{(n)}$ does not exceed $\binom{m}{2}$ and attains that maximum value precisely when the whole sample $(X_1^{(n)}, X_2^{(n)}, \ldots, X_m^{(n)})$ is pairwise coprime. The formula

\[ \lim_{n \to \infty} \mathbf{P}\big((X_1^{(n)}, \ldots, X_m^{(n)}) \text{ pairwise coprime}\big) = \lim_{n \to \infty} \mathbf{P}\Big(C_m^{(n)} = \binom{m}{2}\Big) = \prod_p \Big(1 - \frac{1}{p}\Big)^{m-1} \Big(1 + \frac{m-1}{p}\Big) =: T_m \]

was advanced by M. Schroeder, [30], and proved by P. Moree for $m = 3$, [28], and for all $m \ge 2$ by L. Tóth, [33], and also by J. Cai and E. Bach, [4]; see also [14]. In the case $m = 2$, this limit probability reduces to the classical result of Dirichlet mentioned above, $T_2 = 1/\zeta(2)$.

As the size $m$ of the sample tends to $\infty$, the probability $T_m$ of pairwise coprimality tends to 0, see [33], and also [21]. This is to be compared with the extension of Dirichlet's Theorem, see Section 2 for references, that for each $m \ge 2$,

\[ \lim_{n \to \infty} \mathbf{P}\big((X_1^{(n)}, \ldots, X_m^{(n)}) \text{ coprime}\big) = \lim_{n \to \infty} \mathbf{P}\big(\gcd(X_1^{(n)}, \ldots, X_m^{(n)}) = 1\big) = \frac{1}{\zeta(m)} . \]

Now, the probability $1/\zeta(m)$ of just coprimality tends to 1, as the sample size $m$ tends to $\infty$.
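The limiting probability of pairwise coprimality is an Euler product and can be approximated by truncating it. The sketch below is only illustrative: the function names and the cutoff `prime_bound` are assumptions of this example, and the truncation error is not controlled rigorously.

```python
from math import pi


def primes_up_to(limit):
    """Sieve of Eratosthenes; returns the list of primes <= limit (limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]


def pairwise_coprime_limit(m, prime_bound=10**5):
    """Truncated product over p <= prime_bound of (1 - 1/p)^(m-1) (1 + (m-1)/p)."""
    value = 1.0
    for p in primes_up_to(prime_bound):
        value *= (1 - 1 / p) ** (m - 1) * (1 + (m - 1) / p)
    return value
```

With `m = 2` the truncated product should be close to $6/\pi^2 \approx 0.6079$, and the values decrease as `m` grows, in line with the remark that pairwise coprimality becomes rare for large samples.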

The exact distribution of $C_m^{(n)}$, for sample size $m$ given and fixed $n$, is combinatorially involved; see J. Hu, [20], for an interesting approach.

In this paper we prove that $C_m^{(n)}$ is asymptotically normal as $m$ tends to $\infty$ when $n$ is fixed and, more generally, when $n$ is allowed to vary with $m$, with the only restriction that $n \ge 2$.

Theorem A. For each fixed $n \ge 2$,

\[ \frac{C_m^{(n)} - \mathbf{E}(C_m^{(n)})}{\sqrt{\mathbf{V}(C_m^{(n)})}} \xrightarrow{\ d\ } \mathcal{N} , \quad\text{as } m \to \infty . \]

More generally, the conclusion holds with $n$ replaced by any sequence $n_m$ of integers $n_m \ge 2$.

(This is Theorem 4.3 in Section 4.) By $\xrightarrow{d}$ we mean convergence in distribution; $\mathcal{N}$ represents a standard normal variable.

The counter $C_m^{(n)}$ is a sum of $\binom{m}{2}$ Bernoulli variables with common probability of success, but, of course, they are not independent.

The analysis of $C_m^{(n)}$ could be framed into, at least, two different approaches. On the one hand, for fixed $n$, we could consider $C_m^{(n)}$, or rather $C_m^{(n)}/\binom{m}{2}$, as a sequence of $U$-statistics associated to the symmetric kernel $\mathbf{1}_{\gcd(x,y)=1}$ and apply some general asymptotic results of W. Hoeffding, [19]. Alternatively, we could consider the collection of random variables $\mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)})=1}$, $1 \le i < j \le m$, as a family of locally dependent identically distributed variables and apply some general limit theorems for the sum of such a family, like those of S. Janson, [16], or P. Baldi and Y. Rinott, [1] and [2]. This second approach appears to be more flexible, particularly when $n$ is allowed to vary with $m$; it is the one we shall follow.

Both approaches depend on appropriate estimates of covariances of pairs of variables $\big(\mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)})=1}, \mathbf{1}_{\gcd(X_k^{(n)}, X_l^{(n)})=1}\big)$, of number theoretical nature, which we discuss at length in Sections 2 and 3.
We also consider some other natural $U$-statistics like the sum of gcd of pairs from the sample (or its average, dividing by $\binom{m}{2}$), instead of counting coprime pairs. Consider the variables

\[ Z_m^{(n)} = \sum_{1 \le i < j \le m} \gcd\big(X_i^{(n)}, X_j^{(n)}\big) . \]

We have:

Theorem B. For each fixed $n \ge 2$,

\[ \frac{Z_m^{(n)} - \mathbf{E}(Z_m^{(n)})}{\sqrt{\mathbf{V}(Z_m^{(n)})}} \xrightarrow{\ d\ } \mathcal{N} , \quad\text{as } m \to \infty . \]

More generally, the conclusion holds with $n$ replaced by any sequence $n_m$ of integers $n_m \ge 2$ which verify $n_m \le m^{\beta}$, for $\beta < 1/2$.

(This is Theorem 4.8 in Section 4.) Notice that, in contrast to Theorem A, it is now required that the size of the sample space $n_m$ does not grow too fast as compared with the sample size $m$. It would be interesting to know whether this is really required and not just a restriction of the method of proof.

We consider also, at the opposite end, the random variables

\[ M_m^{(n)} = \max_{1 \le i < j \le m} \big\{ \gcd\big(X_i^{(n)}, X_j^{(n)}\big) \big\} , \]

or, rather, their normalized version $\widetilde{M}_m^{(n)} = \binom{m}{2}^{-1} M_m^{(n)}$. In [9], Darling and Pyle have also considered these random variables and have obtained some interesting asymptotic results about their distribution of values. They have asked whether $\widetilde{M}_m^{(n)}$ has a limit, in distribution, as $m \to \infty$. That this is the case is the content of:

Theorem C. Let $m^{\beta} \le n \le e^{m^{\gamma}}$, for some $\beta > 2$ and $\gamma <$ [...]. Then, for any $t > 0$,

\[ \lim_{m \to \infty} \mathbf{P}\big(\widetilde{M}_m^{(n)} \le t\big) = \exp\Big(-\frac{1}{\zeta(2)\, t}\Big) , \]

so that $\widetilde{M}_m^{(n)}$ tends, in distribution, as $m \to \infty$, to the Fréchet distribution with shape parameter 1 and scale parameter $1/\zeta(2)$.

(This is Theorem 4.10 in Section 4.) Our derivation of Theorem C is based on a classical result of Brown and Silverman (see [3], [32]) on Poisson approximation of $U$-statistics.
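The limit law of Theorem C can be explored by simulation. The following sketch is an illustration only, under stated assumptions: the function names, the seed handling and the sample sizes are choices of this example, and no claim is made about how close the empirical distribution is to the limit for any particular $m$ and $n$.

```python
import random
from itertools import combinations
from math import comb, exp, gcd, pi


def normalized_max_gcd(sample):
    """Maximum gcd over all pairs of the sample, divided by the number of pairs."""
    m = len(sample)
    return max(gcd(a, b) for a, b in combinations(sample, 2)) / comb(m, 2)


def frechet_limit_cdf(t):
    """Limit distribution function exp(-1 / (zeta(2) t)), for t > 0."""
    zeta2 = pi ** 2 / 6
    return exp(-1.0 / (zeta2 * t))


def empirical_cdf(t, m, n, trials, seed=0):
    """Fraction of simulated samples of size m from {1..n} with normalized max <= t."""
    rng = random.Random(seed)
    hits = sum(
        normalized_max_gcd([rng.randint(1, n) for _ in range(m)]) <= t
        for _ in range(trials)
    )
    return hits / trials
```

A call such as `empirical_cdf(1.0, 60, 10**6, 2000)` can then be compared informally with `frechet_limit_cdf(1.0)`.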

These theorems, A, B, and C, have corresponding counterparts for gcd of r-tuples, instead 
of just pairs, or higher moments of gcd instead of just first moments, which we discuss in 
Sections 5 and 6. 

The paper is organized as follows. Section 2 contains results about Euler's $\varphi$ and Pillai's $P$ function which are needed later. Section 3 discusses a formula of Cesàro and derives some estimates of marginal probabilities and expectations, and of the appropriate covariances. Section 4 contains the proofs of Theorems A, B, and C. Section 5 considers the extension of those results to $r$-tuples, while Section 6 discusses the extension to higher moments. Finally, Section 7 discusses a strong law for gcd.

Some notation:

At a number of places we shall have products indexed by prime numbers: $\prod_p$ means product running over all primes $p$, while $\prod_{p \le k}$ and $\prod_{p \mid k}$ mean products running over primes which are less than or equal to $k$, and over primes which divide $k$, respectively.

We denote by $I_s$ the arithmetic function $I_s(j) = j^s$ and simply write $I$ for $I_1$. With $\delta_k$ we denote the arithmetic function $\delta_k(j) = 1$, if $j = k$, and $\delta_k(j) = 0$, otherwise. The number of divisors of an integer $j \ge 1$ is denoted by $\tau(j)$. For any positive real number $x$, we denote by $\{x\}$ its fractional part: $\{x\} = x - \lfloor x \rfloor$. The Möbius function is $\mu$, and $*$ denotes Dirichlet convolution.

For two sequences of positive numbers $(a_n)$ and $(b_n)$, by $a_n \sim b_n$ as $n \to \infty$, we mean that $\lim_{n \to \infty} a_n / b_n = 1$.

The uniform probability in $\{1, \ldots, n\}$ is denoted by $\mathbf{P}_n$, and expectation and variance with respect to $\mathbf{P}_n$ are denoted respectively by $\mathbf{E}_n$ and $\mathbf{V}_n$. Thus for a function (random variable) $f$ defined on $\{1, \ldots, n\}$, we have, for instance, $\mathbf{E}_n(f) = \frac{1}{n} \sum_{j=1}^{n} f(j)$.

2 Euler's $\varphi$, Pillai's $P$ function, and extensions

We collect in this section a number of identities and estimates involving Euler's $\varphi$ function, Pillai's $P$ function, and their corresponding $s$-dimensional versions $\varphi_s$ (Jordan's totient functions) and $P_s$.

2.1 Euler's and Jordan's function

Euler's $\varphi$ function,

\[ \varphi(k) = \sum_{j=1}^{k} \mathbf{1}_{\gcd(j,k)=1} , \quad\text{for each } k \ge 1 , \]

satisfies the identity $\varphi = \mu * I$, and verifies that

\[ \frac{\varphi(k)}{k} = \prod_{p \mid k} \Big(1 - \frac{1}{p}\Big) , \quad\text{for each } k \ge 1 . \]

Observe that $\varphi(k) \le k$, for every integer $k \ge 1$.
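The three descriptions of Euler's function just given (counting definition, Dirichlet convolution $\mu * I$, and product over prime divisors) can be compared numerically. This is a minimal sketch with hypothetical helper names, meant only as a sanity check on small arguments.

```python
from fractions import Fraction
from math import gcd


def prime_factorization(k):
    """Return {prime: exponent} for k >= 1, by trial division."""
    factors, p = {}, 2
    while p * p <= k:
        while k % p == 0:
            factors[p] = factors.get(p, 0) + 1
            k //= p
        p += 1
    if k > 1:
        factors[k] = factors.get(k, 0) + 1
    return factors


def mobius(j):
    """Moebius function: 0 if j has a square factor, else (-1)^(number of primes)."""
    factors = prime_factorization(j)
    return 0 if any(e > 1 for e in factors.values()) else (-1) ** len(factors)


def phi_by_count(k):
    """Number of j in {1..k} coprime with k."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)


def phi_by_convolution(k):
    """(mu * I)(k) = sum over divisors d of k of mu(d) * (k / d)."""
    return sum(mobius(d) * (k // d) for d in range(1, k + 1) if k % d == 0)


def phi_by_product(k):
    """k times the product over primes p dividing k of (1 - 1/p)."""
    value = Fraction(k)
    for p in prime_factorization(k):
        value *= Fraction(p - 1, p)
    return int(value)
```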



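2.1.1 A double series involving $\varphi$ and gcd

The following identity shall prove useful:

Lemma 2.1. For every $t > 1$,

\[ (2.1) \quad \sum_{i,j=1}^{\infty} \frac{\varphi(i)\varphi(j)}{i^{t+1} j^{t+1}} \gcd(i,j) = \zeta(2t-1) \prod_p \frac{1}{(1 - p^{-t})^2} \Big(1 - \frac{2}{p^{t+1}} - \frac{3}{p^{2t}} + \frac{3}{p^{2t+1}} + \frac{2}{p^{3t}} - \frac{1}{p^{4t+1}}\Big) \]
\[ \qquad = \zeta(2t-1)\, \zeta(t)^2 \prod_p \Big(1 - \frac{2}{p^{t+1}} - \frac{3}{p^{2t}} + \frac{3}{p^{2t+1}} + \frac{2}{p^{3t}} - \frac{1}{p^{4t+1}}\Big) . \]

For every $t > 1$, we shall denote

\[ (2.2) \quad M(t) := \sum_{i,j=1}^{\infty} \frac{\varphi(i)\varphi(j)}{i^{t+1} j^{t+1}} \gcd(i,j) . \]

Observe, in particular, that $M(t) < +\infty$ for every $t > 1$, a fact which may be checked by bounding $M(t)$ from above by the double sum

\[ \sum_{i,j=1}^{\infty} \frac{1}{i^{t} j^{t}} \gcd(i,j) = \frac{\zeta(t)^2\, \zeta(2t-1)}{\zeta(2t)} , \]

see [14] and the proof of Lemma 2.1.

The second product expression for $M(t)$ in (2.1) will be most convenient so as to apply some Tauberian theorem, see Corollary 2.2.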
The proof of Lemma 2.1 uses the so-called zeta (probability) distributions on $\mathbb{N}$, whose definition and some of their useful properties we now recall. For each real $t > 1$, the zeta distribution $Q_t$ on $\mathbb{N}$ is given by

\[ Q_t(j) = \frac{1}{\zeta(t)} \, \frac{1}{j^t} , \quad\text{for each integer } j \ge 1 . \]

For every prime number $p$, the random variable $\alpha_p$ on $\mathbb{N}$ assigns to each integer $j \ge 1$ the largest exponent $\alpha \ge 0$ so that $p^{\alpha} \mid j$ (thus $p^{\alpha_p(j)} \mid j$, but $p^{\alpha_p(j)+1} \nmid j$); in particular, $\{\alpha_p > 0\}$ is the event "divisible by $p$", and, besides,

\[ j = \prod_p p^{\alpha_p(j)} , \quad\text{for each integer } j \ge 1 . \]

With respect to $Q_t$, the variables $\{\alpha_p\}_p$ are mutually independent, and, moreover, each $\alpha_p$ is distributed as a geometric random variable on $\{0, 1, 2, \ldots\}$ with success probability $1 - 1/p^t$:

\[ Q_t(\alpha_p = k) = \Big(1 - \frac{1}{p^t}\Big) \frac{1}{p^{kt}} , \quad\text{for each integer } k \ge 0 . \]

See Golomb [15], Diaconis [11], Kingman [23], or [14]; and particularly Lloyd [27].
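The geometric law of the exponents $\alpha_p$ under the zeta distribution can be checked numerically by truncating the sums that define $Q_t$. The sketch below is a rough illustration: the function names and the truncation level `cutoff` are assumptions of this example, and the agreement is only up to the truncation error.

```python
def valuation(j, p):
    """Largest exponent a >= 0 such that p^a divides j."""
    a = 0
    while j % p == 0:
        j //= p
        a += 1
    return a


def zeta_law_of_valuation(p, t, k, cutoff=20000):
    """Approximate Q_t(alpha_p = k) using only the integers j <= cutoff."""
    total = sum(j ** -t for j in range(1, cutoff + 1))
    mass = sum(j ** -t for j in range(1, cutoff + 1) if valuation(j, p) == k)
    return mass / total


def geometric_mass(p, t, k):
    """(1 - p^-t) * p^(-k t), the geometric probability of the value k."""
    return (1 - p ** -t) * p ** (-k * t)
```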

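Proof of Lemma 2.1. We first observe that the second infinite product expression follows from the first one just from the Euler product expansion $\zeta(t) = \prod_p \frac{1}{1 - p^{-t}}$; so that we just verify the first one.

We denote by $Q_t^2$ the product probability $Q_t \times Q_t$ on $\mathbb{N}^2$ and write $\mathbf{E}_{Q_t^2}$ for the corresponding expectations. Consider the variable $G$ on $\mathbb{N}^2$ given by

\[ G(i,j) = \frac{\varphi(i)}{i} \, \frac{\varphi(j)}{j} \, \gcd(i,j) . \]

Observe that, for $t > 1$,

\[ M(t) = \zeta(t)^2 \, \mathbf{E}_{Q_t^2}(G) . \]

We introduce the auxiliary arithmetic function $h$ given by $h(j) = 1$, if $j \ge 1$, and $h(0) = 0$, so that we may write $\varphi$ and gcd in terms of the variables $\alpha_p$ as

\[ \frac{\varphi(i)}{i} = \prod_{p \mid i} \Big(1 - \frac{1}{p}\Big) = \prod_p \Big(1 - \frac{h(\alpha_p(i))}{p}\Big) , \qquad \gcd(i,j) = \prod_p p^{\min(\alpha_p(i), \alpha_p(j))} , \]

and then $G$ itself as

\[ G(i,j) = \prod_p \Big(1 - \frac{h(\alpha_p(i))}{p}\Big) \Big(1 - \frac{h(\alpha_p(j))}{p}\Big) p^{\min(\alpha_p(i), \alpha_p(j))} , \]

which is an infinite product of mutually independent random variables.

Now, for each fixed prime $p$, we have that

\[ \mathbf{E}_{Q_t^2}\Big[\Big(1 - \frac{h(\alpha_p(i))}{p}\Big) \Big(1 - \frac{h(\alpha_p(j))}{p}\Big) p^{\min(\alpha_p(i), \alpha_p(j))}\Big] = \Big(1 - \frac{1}{p^t}\Big)^2 \sum_{k,l \ge 0} \Big(1 - \frac{h(k)}{p}\Big) \Big(1 - \frac{h(l)}{p}\Big) \frac{p^{\min(k,l)}}{p^{kt}\, p^{lt}} . \]

Split the range of the double sum into the four ranges $\{k = 0, l = 0\}$, $\{k = 0, l > 0\}$, $\{k > 0, l = 0\}$ and $\{k > 0, l > 0\}$, sum several geometric series and simplify to get the compact expression:

\[ \frac{1}{1 - 1/p^{2t-1}} \Big(1 - \frac{2}{p^{t+1}} - \frac{3}{p^{2t}} + \frac{3}{p^{2t+1}} + \frac{2}{p^{3t}} - \frac{1}{p^{4t+1}}\Big) . \]

Now, since the $\alpha_p$'s are mutually independent, we may write, at least formally, that

\[ \mathbf{E}_{Q_t^2}(G) = \prod_p \mathbf{E}_{Q_t^2}\Big[\Big(1 - \frac{h(\alpha_p(i))}{p}\Big) \Big(1 - \frac{h(\alpha_p(j))}{p}\Big) p^{\min(\alpha_p(i), \alpha_p(j))}\Big] , \]

to obtain the desired result.

To justify the formal step, denote $H(i,j) = \gcd(i,j) = \prod_p p^{\min(\alpha_p(i), \alpha_p(j))}$. From monotone convergence, independence and the fact that $\min(\alpha_p(i), \alpha_p(j))$ is again a geometric variable on $\{0, 1, 2, \ldots\}$ but with probability of success $1 - 1/p^{2t}$, one deduces that

\[ \frac{1}{\zeta(t)^2} \sum_{i,j=1}^{\infty} \frac{\gcd(i,j)}{i^t j^t} = \mathbf{E}_{Q_t^2}(H) = \frac{\zeta(2t-1)}{\zeta(2t)} < +\infty . \]

(See [14] for details and extensions.) Define, for every integer $N \ge 1$, the variable $G_N$, partial product, given by

\[ G_N(i,j) = \prod_{p \le N} \Big(1 - \frac{h(\alpha_p(i))}{p}\Big) \Big(1 - \frac{h(\alpha_p(j))}{p}\Big) p^{\min(\alpha_p(i), \alpha_p(j))} . \]

Now, $G_N(i,j) \le H(i,j)$, and $\lim_{N \to \infty} G_N(i,j) = G(i,j)$, for any integers $i, j \ge 1$ and so, by dominated convergence, we deduce

\[ \lim_{N \to \infty} \mathbf{E}_{Q_t^2}(G_N) = \mathbf{E}_{Q_t^2}(G) . \]

And, finally, since $G_N$ is a finite product of independent variables, we have

\[ \mathbf{E}_{Q_t^2}(G_N) = \prod_{p \le N} \mathbf{E}_{Q_t^2}\Big[\Big(1 - \frac{h(\alpha_p(i))}{p}\Big) \Big(1 - \frac{h(\alpha_p(j))}{p}\Big) p^{\min(\alpha_p(i), \alpha_p(j))}\Big] , \]

and the proof is completed. □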
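The double sum which would correspond to $t = 1$ is infinite:

\[ \sum_{i,j=1}^{\infty} \frac{\varphi(i)\varphi(j)}{i^{2} j^{2}} \gcd(i,j) = +\infty ; \]

the following corollary gives a suitable estimate for its rate of divergence to $\infty$.

Corollary 2.2. As $N \to \infty$,

\[ \sum_{i \cdot j \le N} \frac{\varphi(i)\varphi(j)}{i^{2} j^{2}} \gcd(i,j) \sim A \ln(N)^3 , \]

where $A$ is the number

\[ A = \frac{1}{12} \prod_p \Big(1 - \frac{5}{p^2} + \frac{5}{p^3} - \frac{1}{p^5}\Big) \approx 0.01186 . \]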
(The summation above is over the set of integers $i, j \ge 1$ whose product $i \cdot j \le N$.) In particular,

\[ \liminf_{N \to \infty} \frac{1}{\ln(N)^3} \sum_{\mathrm{lcm}(i,j) \le N} \frac{\varphi(i)\varphi(j)}{i^{2} j^{2}} \gcd(i,j) \ge A . \]

In the proof of Corollary 2.2 we shall resort to (a particular case of) the powerful Delange's Tauberian Theorem, which we may write as follows:

Theorem 2.3 (Delange, [10], Theorem 1). Let $A(z) := \sum_{k=1}^{\infty} a_k / k^z$ be a Dirichlet series with nonnegative coefficients which has abscissa of convergence $\rho > 0$ and is holomorphic on the whole line $\Re(z) = \rho$ except at the point $z = \rho$.

Assume that for two functions $F(z)$ and $G(z)$, holomorphic in $\Re(z) \ge \rho$, and for some real $\beta > 0$ we have

\[ (2.3) \quad A(z) = \frac{F(z)}{(z - \rho)^{\beta}} + G(z) , \quad\text{for } \Re(z) > \rho , \]

and $F(\rho) \ne 0$. Then, as $N \to \infty$,

\[ (2.4) \quad \sum_{k \le N} a_k \sim \frac{F(\rho)}{\rho\, \Gamma(\beta)} \, N^{\rho} \ln(N)^{\beta - 1} . \]

For non integer $\beta$, the power $(z - \rho)^{\beta}$ in (2.3) means its principal branch.

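Proof of Corollary 2.2. Observe first that the asymptotic comparison closing the statement of the corollary follows simply from the fact that $\mathrm{lcm}(i,j) \le i \cdot j$.

Denote by $B(z)$ the function

\[ B(z) = \prod_p \Big(1 - \frac{2}{p^{z+1}} - \frac{3}{p^{2z}} + \frac{3}{p^{2z+1}} + \frac{2}{p^{3z}} - \frac{1}{p^{4z+1}}\Big) , \]

holomorphic and nonvanishing for $\Re(z) > 1/2$. Observe that $B(1) = \prod_p \big(1 - \frac{5}{p^2} + \frac{5}{p^3} - \frac{1}{p^5}\big) = 12 A$. Also, denote by $C$ the entire function $C(z) = (z - 1)\zeta(z)$, for $z \in \mathbb{C}$, and observe that $C(1) = 1$.

Extend the function $M$ given in (2.2) to a holomorphic function in $\Re(z) > 1$:

\[ M(z) = \sum_{i,j=1}^{\infty} \frac{\varphi(i)\varphi(j)}{i^{z+1} j^{z+1}} \gcd(i,j) = \zeta(2z-1)\, \zeta(z)^2\, B(z) . \]

For each integer $k \ge 1$, define the positive coefficient

\[ a_k = \sum_{i \cdot j = k} \frac{\varphi(i)\varphi(j)}{i\, j} \gcd(i,j) , \]

to express $M$ as a Dirichlet series

\[ M(z) = \sum_{k=1}^{\infty} \frac{a_k}{k^z} . \]

For $\Re(z) > 1$ we may write

\[ M(z) = \frac{1}{(z-1)^3} \, \frac{1}{2}\, C(2z-1)\, C(z)^2\, B(z) . \]

The function $F(z) = \frac{1}{2} C(2z-1) C(z)^2 B(z)$ is holomorphic for $\Re(z) > 1/2$, and $F(1) = \frac{1}{2} B(1)$. Delange's Tauberian Theorem (with $\rho = 1$, $\beta = 3$, $F$ as above and $G \equiv 0$) gives then that

\[ \sum_{k=1}^{n} a_k \sim \frac{B(1)}{4} \, n \ln(n)^2 , \quad\text{as } n \to \infty . \]

From summation by parts, we finally deduce that

\[ \sum_{k=1}^{n} \frac{a_k}{k} \sim \frac{B(1)}{12} \ln(n)^3 = A \ln(n)^3 , \quad\text{as } n \to \infty , \]

and, therefore, as desired, that

\[ \sum_{i \cdot j \le n} \frac{\varphi(i)\varphi(j)}{i^{2} j^{2}} \gcd(i,j) \sim A \ln(n)^3 , \quad\text{as } n \to \infty . \]

□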

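2.1.2 Jordan's functions

For each integer $s \ge 1$, the ($s$-) Jordan totient function, denoted here by $\varphi_s$, is given by the convolution

\[ \varphi_s = \mu * I_s . \]

For each integer $k \ge 1$, the function $\varphi_s$ counts the number of $s$-tuples of integers $(k_1, \ldots, k_s)$ with $1 \le k_1, \ldots, k_s \le k$, such that $\gcd(k_1, \ldots, k_s, k) = 1$. Of course, $\varphi_1 = \varphi$. Observe that

\[ \varphi_s(k) = k^s \sum_{j \mid k} \frac{\mu(j)}{j^s} = k^s \prod_{p \mid k} \Big(1 - \frac{1}{p^s}\Big) , \quad\text{for each integer } k \ge 1 . \]

Notice also that $\varphi_s$ satisfies $1 \le \varphi_s(k) \le k^s$, for each integer $k \ge 1$.

For $\varphi_s$ there is an identity analogous to that of Lemma 2.1 for $\varphi$:

Lemma 2.4. For every real $t > 1$, and for each integer $s \ge 1$,

\[ (2.5) \quad \sum_{i,j=1}^{\infty} \frac{\varphi_s(i)\varphi_s(j)}{i^{t+s} j^{t+s}} \gcd(i,j) = \zeta(2t-1)\, \zeta(t)^2 \prod_p \Big(1 - \frac{2}{p^{t+s}} - \frac{1}{p^{2t}} - \frac{2}{p^{2t+s-1}} + \frac{2}{p^{2t+s}} + \frac{1}{p^{2t+2s-1}} + \frac{2}{p^{3t+s-1}} - \frac{1}{p^{4t+2s-1}}\Big) . \]

And a corresponding estimate:

Corollary 2.5. For each integer $s \ge 1$,

\[ \liminf_{N \to \infty} \frac{1}{\ln(N)^3} \sum_{\mathrm{lcm}(i,j) \le N} \frac{\varphi_s(i)\varphi_s(j)}{i^{s+1} j^{s+1}} \gcd(i,j) \ge A_s , \]

where

\[ A_s = \frac{1}{12} \prod_p \Big(1 - \frac{1}{p^2} - \frac{4}{p^{s+1}} + \frac{4}{p^{s+2}} + \frac{1}{p^{2s+1}} - \frac{1}{p^{2s+3}}\Big) . \]

Of course, the constant $A_1$ coincides with the constant $A$ of Corollary 2.2. The proof of Lemma 2.4 proceeds along the same lines as that of Lemma 2.1, but using now the expression

\[ \frac{\varphi_s(k)}{k^s} = \prod_p \Big(1 - \frac{h(\alpha_p(k))}{p^s}\Big) . \]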
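The counting interpretation of Jordan's totient and its product formula can be compared directly for small arguments. This is a minimal sketch with hypothetical function names; the brute-force count enumerates $k^s$ tuples, so it is only practical for small $k$ and $s$.

```python
from functools import reduce
from itertools import product
from math import gcd


def jordan_by_count(s, k):
    """Number of s-tuples in {1..k}^s whose gcd together with k equals 1."""
    return sum(
        1
        for tup in product(range(1, k + 1), repeat=s)
        if reduce(gcd, tup, k) == 1
    )


def jordan_by_product(s, k):
    """k^s times the product over primes p dividing k of (1 - 1/p^s)."""
    value, rest, p = k ** s, k, 2
    while p * p <= rest:
        if rest % p == 0:
            value -= value // p ** s   # multiply by (1 - 1/p^s), exactly
            while rest % p == 0:
                rest //= p
        p += 1
    if rest > 1:                        # one remaining prime factor
        value -= value // rest ** s
    return value
```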
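2.1.3 Asymptotic behavior of averages of $\varphi$ and of $\varphi_s$. Schur's constants

The following lemma records the asymptotic behavior of certain averages of $\varphi$.

Lemma 2.6. For every integer $l \ge 1$,

\[ (2.6) \quad S_l^{(1)} := \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{\varphi(k)}{k}\Big)^{l} = \prod_p \Big(1 - \frac{1}{p}\Big(1 - \Big(1 - \frac{1}{p}\Big)^{l}\Big)\Big) . \]

Observe that $S_1^{(1)} = \prod_p \big(1 - \frac{1}{p^2}\big) = \frac{1}{\zeta(2)}$. The case $l = 1$ is just a direct consequence of the identity

\[ \frac{1}{n} \sum_{k=1}^{n} \frac{\varphi(k)}{k} = \frac{1}{n} \sum_{k=1}^{n} \sum_{j \mid k} \frac{\mu(j)}{j} = \frac{1}{n} \sum_{j=1}^{n} \frac{\mu(j)}{j} \Big\lfloor \frac{n}{j} \Big\rfloor . \]

The case $l \ge 2$ is a result of Schur (see [22], page 58).

More generally, and following the same approach as in [22], we may obtain the corresponding results for $\varphi_s$:

Lemma 2.7 (Schur's constants). For all integers $s, l \ge 1$,

\[ (2.7) \quad S_l^{(s)} := \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{\varphi_s(k)}{k^s}\Big)^{l} = \prod_p \Big(1 - \frac{1}{p}\Big(1 - \Big(1 - \frac{1}{p^s}\Big)^{l}\Big)\Big) . \]

Again, observe that $S_1^{(s)} = \frac{1}{\zeta(s+1)}$. Notice, for later use, that for any integers $s \ge 1$ and $l \ge 2$,

\[ (2.8) \quad S_l^{(s)} > \big(S_1^{(s)}\big)^{l} ; \]

strict inequality. Actually, in this paper, only the exponents $l = 1, 2$ are needed.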



2.2 Pillai's functions

The arithmetic function of Pillai is defined for integer $k \ge 1$ as

\[ P(k) = \sum_{j=1}^{k} \gcd(j,k) . \]

Observe that $P(k)$ may be written as

\[ P(k) = \sum_{d \mid k} d \, \varphi\Big(\frac{k}{d}\Big) , \]

so that

\[ \frac{P(k)}{k} = \sum_{j \mid k} \frac{\varphi(j)}{j} . \]

Consider next, for each integer $s \ge 1$, the arithmetic function given by the convolution $P_s = \varphi * I_s$; thus

\[ P_s(k) = \sum_{j=1}^{k} \gcd(j,k)^{s} . \]

Observe that

\[ (2.9) \quad \frac{P_s(k)}{k^s} = \sum_{j \mid k} \frac{\varphi(j)}{j^s} . \]

The function $P_s$ may be written also as $P_s = \varphi_s * I$ and interpreted alternatively as

\[ P_s(k) = \sum_{i_1, i_2, \ldots, i_s = 1}^{k} \gcd(i_1, i_2, \ldots, i_s, k) . \]

Notice also that

\[ (2.10) \quad \frac{P_s(k)}{k^s} \le \frac{P(k)}{k} \le \tau(k) , \quad\text{for each integer } k \ge 1 . \]

We refer to [34] for further information on $P$ and $P_s$.
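The definition of $P_s$, its convolution form $\varphi * I_s$, and the chain of inequalities (2.10) can be checked on small arguments. The following sketch uses hypothetical helper names and exact rational arithmetic for the inequalities.

```python
from fractions import Fraction
from math import gcd


def phi(k):
    """Euler's function, by direct counting."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)


def divisors(k):
    """List of positive divisors of k."""
    return [d for d in range(1, k + 1) if k % d == 0]


def pillai_by_definition(k, s=1):
    """P_s(k) = sum over j in {1..k} of gcd(j, k)^s."""
    return sum(gcd(j, k) ** s for j in range(1, k + 1))


def pillai_by_convolution(k, s=1):
    """(phi * I_s)(k) = sum over divisors d of k of phi(k/d) * d^s."""
    return sum(phi(k // d) * d ** s for d in divisors(k))
```

For example, `pillai_by_definition(6)` sums the gcd's 1, 2, 3, 2, 1, 6 and returns 15.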

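Although both $\varphi_s = \mu * I_s$ and $P_s = \varphi * I_s$ are well defined for real $s \ge 1$, we just consider the case $s$ integer.

2.2.1 Asymptotic behavior of averages of $P$ and $P_s$

We shall need the asymptotic behavior of the averages, first and second moments, of $P$ and $P_s$.

Since $P = \mu * I * I$, the Dirichlet series, with variable $z$, of Pillai's function is given by:

\[ \sum_{k=1}^{\infty} \frac{P(k)}{k^z} = \frac{\zeta(z-1)^2}{\zeta(z)} . \]

Writing

\[ \sum_{k=1}^{\infty} \frac{P(k)}{k} \, \frac{1}{k^z} = \frac{\zeta(z)^2}{\zeta(z+1)} , \]

we deduce directly, say from Delange's Theorem 2.3, that

\[ \frac{1}{n} \sum_{k=1}^{n} \frac{P(k)}{k} \sim \frac{1}{\zeta(2)} \ln(n) , \quad\text{as } n \to \infty . \]

For $s \ge 2$, we may write, using $P_s = \mu * I * I_s$, that

\[ \sum_{k=1}^{\infty} \frac{P_s(k)}{k^s} \, \frac{1}{k^z} = \frac{\zeta(z)\, \zeta(z+s-1)}{\zeta(z+s)} , \]

to deduce, as above, that

\[ \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \frac{P_s(k)}{k^s} = \frac{\zeta(s)}{\zeta(s+1)} . \]

Thus,

Lemma 2.8. As $n \to \infty$,

\[ (2.11) \quad \frac{1}{n} \sum_{k=1}^{n} \frac{P(k)}{k} \sim \frac{1}{\zeta(2)} \ln(n) , \]

while, for $s \ge 2$,

\[ (2.12) \quad \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \frac{P_s(k)}{k^s} = \frac{\zeta(s)}{\zeta(s+1)} . \]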
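To obtain the asymptotic behavior of the averages of $P$ and $P_s$ for exponent $l = 2$, we proceed as follows. For $s \ge 1$, we may write

\[ (2.13) \quad \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P_s(k)}{k^s}\Big)^2 = \frac{1}{n} \sum_{k=1}^{n} \Big(\sum_{j \mid k} \frac{\varphi(j)}{j^s}\Big)^2 = \frac{1}{n} \sum_{k=1}^{n} \sum_{i,j \mid k} \frac{\varphi(i)\varphi(j)}{i^s j^s} \]
\[ (2.14) \quad\qquad = \sum_{1 \le i,j \le n} \frac{\varphi(i)\varphi(j)}{i^s j^s} \, \frac{1}{n} \Big\lfloor \frac{n}{\mathrm{lcm}(i,j)} \Big\rfloor . \]

For fixed integers $i, j \ge 1$, we have that

\[ \frac{1}{n} \Big\lfloor \frac{n}{\mathrm{lcm}(i,j)} \Big\rfloor \le \frac{1}{\mathrm{lcm}(i,j)} = \frac{\gcd(i,j)}{i\, j} , \]

and also that

\[ \lim_{n \to \infty} \frac{1}{n} \Big\lfloor \frac{n}{\mathrm{lcm}(i,j)} \Big\rfloor = \frac{\gcd(i,j)}{i\, j} . \]

We split the argument into the two cases $s \ge 2$ and $s = 1$. For $s \ge 2$, Lemma 2.1 gives that

\[ \sum_{i,j=1}^{\infty} \frac{\varphi(i)\varphi(j)}{i^{s+1} j^{s+1}} \gcd(i,j) = M(s) < +\infty , \]

and dominated convergence then gives that

\[ \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P_s(k)}{k^s}\Big)^2 = M(s) . \]

For the case $s = 1$, write

\[ \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P(k)}{k}\Big)^2 = \sum_{\mathrm{lcm}(i,j) \le n} \frac{\varphi(i)}{i} \, \frac{\varphi(j)}{j} \, \frac{1}{n} \Big\lfloor \frac{n}{\mathrm{lcm}(i,j)} \Big\rfloor . \]

(Notice the range of summation.) Using that for any fixed integer $K \ge 2$, one has that $\lfloor x \rfloor \ge (1 - 1/K)\, x$, for any real $x \ge K$, we may bound

\[ \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P(k)}{k}\Big)^2 \ge \sum_{\mathrm{lcm}(i,j) \le n/K} \frac{\varphi(i)}{i} \, \frac{\varphi(j)}{j} \, \frac{1}{n} \Big\lfloor \frac{n}{\mathrm{lcm}(i,j)} \Big\rfloor \ge (1 - 1/K) \sum_{\mathrm{lcm}(i,j) \le n/K} \frac{\varphi(i)\varphi(j)}{i^2 j^2} \gcd(i,j) . \]

Using now Corollary 2.2 we may conclude that

\[ \liminf_{n \to \infty} \frac{1}{\ln(n)^3} \, \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P(k)}{k}\Big)^2 \ge (1 - 1/K)\, A , \]

and, consequently,

\[ \liminf_{n \to \infty} \frac{1}{\ln(n)^3} \, \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P(k)}{k}\Big)^2 \ge A . \]

We record these results in the following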
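Lemma 2.9. For $s = 1$,

\[ (2.15) \quad \liminf_{n \to \infty} \frac{1}{\ln(n)^3} \, \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P(k)}{k}\Big)^2 \ge A , \]

while, for $s \ge 2$,

\[ (2.16) \quad \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P_s(k)}{k^s}\Big)^2 = M(s) . \]

Remark 2.10 (A theorem of L. Tóth). A result of L. Tóth (see [34], Theorem A) gives a precise version of (2.15):

\[ \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P(k)}{k}\Big)^2 \sim A_{\text{Tóth}} \ln(n)^3 , \quad\text{as } n \to \infty , \]

where $A_{\text{Tóth}}$ is the constant $A_{\text{Tóth}} = \frac{1}{\pi^2} \prod_p \big(1 + \frac{1}{p^3} - \frac{4}{p(p+1)}\big)$. Since, for each $p$,

\[ 1 + \frac{1}{p^3} - \frac{4}{p(p+1)} = \Big(1 - \frac{1}{p^2}\Big)^{-1} \Big(1 - \frac{5}{p^2} + \frac{5}{p^3} - \frac{1}{p^5}\Big) , \]

actually, $A_{\text{Tóth}} = 2A$.

Observe that, from the discussion above and Tóth's Theorem, one deduces that

\[ \sum_{\mathrm{lcm}(i,j) \le n} \frac{\varphi(i)\varphi(j)}{i^2 j^2} \gcd(i,j) \sim A_{\text{Tóth}} \ln(n)^3 , \quad\text{as } n \to \infty , \]

while, according to Corollary 2.2,

\[ \sum_{i \cdot j \le n} \frac{\varphi(i)\varphi(j)}{i^2 j^2} \gcd(i,j) \sim A \ln(n)^3 , \quad\text{as } n \to \infty . \]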
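The local identity behind $A_{\text{Tóth}} = 2A$, and the numerical value of $A$, can be checked with a few lines of code. This is a sketch under stated assumptions: the function names and the prime cutoff are choices of this example, the identity is verified exactly only for the primes tested, and the value of $A$ is a truncated product.

```python
from fractions import Fraction


def primes_up_to(limit):
    """Primes <= limit by trial division against smaller primes."""
    primes = []
    for q in range(2, limit + 1):
        if all(q % p for p in primes if p * p <= q):
            primes.append(q)
    return primes


def local_factor_A(p):
    """1 - 5/p^2 + 5/p^3 - 1/p^5, the Euler factor of 12 * A."""
    return 1 - Fraction(5, p**2) + Fraction(5, p**3) - Fraction(1, p**5)


def local_factor_toth(p):
    """1 + 1/p^3 - 4/(p (p + 1)), the Euler factor of pi^2 * A_Toth."""
    return 1 + Fraction(1, p**3) - Fraction(4, p * (p + 1))


def constant_A(prime_bound=10**4):
    """Truncated product approximation of A = (1/12) * prod_p local_factor_A(p)."""
    value = 1.0
    for p in primes_up_to(prime_bound):
        value *= float(local_factor_A(p))
    return value / 12
```

Since $\frac{1}{\pi^2} = \frac{1}{6\,\zeta(2)}$ and $\zeta(2) = \prod_p (1 - p^{-2})^{-1}$, the factor-by-factor identity gives $A_{\text{Tóth}} = \frac{1}{6} \cdot 12 A = 2A$.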

3 Cesàro's formula

The following lemma, which comes from [5] and [6], gives a useful closed formula for the expectation of any function of gcd:

Lemma 3.1 (Cesàro's formula). Let $F$ be any arithmetic function. Then, for any integers $n, r \ge 1$,

\[ (3.1) \quad \mathbf{E}\big(F\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)})\big)\big) = \frac{1}{n^r} \sum_{j=1}^{n} (\mu * F)(j) \Big\lfloor \frac{n}{j} \Big\rfloor^{r} . \]

The expression above is valid also for $r = 1$, with the conventional understanding that $\gcd(j) = j$, for any integer $j \ge 1$. For a (detailed) proof we refer to [14].
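Cesàro's formula (3.1) lends itself to a direct check against exhaustive enumeration for small $n$ and $r$. The sketch below is illustrative only; the function names are hypothetical, and `F` is any Python callable on positive integers standing in for the arithmetic function $F$.

```python
from fractions import Fraction
from functools import reduce
from itertools import product
from math import gcd


def mobius(j):
    """Moebius function mu(j), by trial division."""
    result, p = 1, 2
    while p * p <= j:
        if j % p == 0:
            j //= p
            if j % p == 0:
                return 0
            result = -result
        p += 1
    return -result if j > 1 else result


def mu_convolve(F, j):
    """(mu * F)(j) = sum over divisors d of j of mu(d) * F(j / d)."""
    return sum(mobius(d) * F(j // d) for d in range(1, j + 1) if j % d == 0)


def expectation_brute(F, n, r):
    """E F(gcd(X_1..X_r)) by enumerating all n^r tuples."""
    total = sum(F(reduce(gcd, tup)) for tup in product(range(1, n + 1), repeat=r))
    return Fraction(total, n ** r)


def expectation_cesaro(F, n, r):
    """Right hand side of (3.1)."""
    total = sum(mu_convolve(F, j) * (n // j) ** r for j in range(1, n + 1))
    return Fraction(total, n ** r)
```

Taking `F = lambda g: g` recovers the mean of the gcd, and `F = lambda g: 1 if g == 1 else 0` recovers the probability of coprimality.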
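For a fixed integer $k \ge 1$, Cesàro's formula with $F = \delta_k$ reads

\[ \mathbf{P}\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)}) = k\big) = \frac{1}{n^r} \sum_{j \le n/k} \mu(j) \Big\lfloor \frac{n}{jk} \Big\rfloor^{r} , \]

from which one deduces the following asymptotic result, as $n \to \infty$, for the probability distribution of the gcd of a random $r$-tuple:

\[ \lim_{n \to \infty} \mathbf{P}\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)}) = k\big) = \frac{1}{\zeta(r)} \, \frac{1}{k^r} , \]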
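which, for the case $k = 1$, reads

\[ (3.2) \quad \lim_{n \to \infty} \mathbf{P}\big((X_1^{(n)}, \ldots, X_r^{(n)}) \text{ coprime}\big) = \frac{1}{\zeta(r)} . \]

The case $r = 2$ is Dirichlet's Theorem. For $r \ge 3$, see Cesàro [6] (page 293), D. N. Lehmer [26] (Chapter V), and also [7], [18] or [29].

If we set $F = I$ in Cesàro's formula we obtain, since $\mu * I = \varphi$, that

\[ \mathbf{E}\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)})\big) = \frac{1}{n^r} \sum_{j=1}^{n} \varphi(j) \Big\lfloor \frac{n}{j} \Big\rfloor^{r} , \]

from which the following asymptotic results for the expectation of the gcd of a random $r$-tuple are deduced: for $r \ge 3$,

\[ (3.3) \quad \lim_{n \to \infty} \mathbf{E}\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)})\big) = \frac{\zeta(r-1)}{\zeta(r)} , \]

while for $r = 2$,

\[ (3.4) \quad \mathbf{E}\big(\gcd(X_1^{(n)}, X_2^{(n)})\big) \sim \frac{1}{\zeta(2)} \ln(n) , \quad\text{as } n \to \infty . \]

For the second moments of gcd, which we shall need later on, we have (see for instance, [12], and also [14] and the references therein):

\[ (3.5) \quad \text{for } r \ge 4, \quad \lim_{n \to \infty} \mathbf{E}\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)})^2\big) = \frac{\zeta(r-2)}{\zeta(r)} ; \]
\[ (3.6) \quad \text{for } r = 3, \quad \mathbf{E}\big(\gcd(X_1^{(n)}, X_2^{(n)}, X_3^{(n)})^2\big) \sim \frac{1}{\zeta(3)} \ln(n) , \quad\text{as } n \to \infty ; \]
\[ (3.7) \quad \text{for } r = 2, \quad \mathbf{E}\big(\gcd(X_1^{(n)}, X_2^{(n)})^2\big) \sim \Big[\frac{1}{3}\Big(\frac{2\zeta(2)}{\zeta(3)} - 1\Big)\Big] \, n , \quad\text{as } n \to \infty . \]

The following lemma is elementary:

Lemma 3.2. Let $F$ be any arithmetic function. For each integer $k \ge 1$, define $F_k$ as the arithmetic function

\[ j \ge 1 \mapsto F_k(j) = F(\gcd(j,k)) . \]

Then for any integer $j \ge 1$

\[ (\mu * F_k)(j) = \begin{cases} (\mu * F)(j) , & \text{if } j \mid k , \\ 0 , & \text{if } j \nmid k . \end{cases} \]

This lemma, in conjunction with Cesàro's formula, provides a closed formula for marginal expectations:

Corollary 3.3 (Cesàro's marginal formula). Let $F$ be any arithmetic function. Then for any integers $1 \le k \le n$, and $r \ge 1$

\[ (3.8) \quad \mathbf{E}\big(F\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)}, k)\big)\big) = \frac{1}{n^r} \sum_{j \mid k} (\mu * F)(j) \Big\lfloor \frac{n}{j} \Big\rfloor^{r} . \]

On the left hand side we have expectation marginal on $X_{r+1}^{(n)} = k$, while on the right the sum extends only over divisors of $k$.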
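The marginal formula (3.8) can be verified in the same brute-force spirit, fixing the last entry of the tuple at $k$. As before this is only a sketch with hypothetical names, practical for small $n$ and $r$.

```python
from fractions import Fraction
from functools import reduce
from itertools import product
from math import gcd


def mobius(j):
    """Moebius function mu(j), by trial division."""
    result, p = 1, 2
    while p * p <= j:
        if j % p == 0:
            j //= p
            if j % p == 0:
                return 0
            result = -result
        p += 1
    return -result if j > 1 else result


def mu_convolve(F, j):
    """(mu * F)(j) = sum over divisors d of j of mu(d) * F(j / d)."""
    return sum(mobius(d) * F(j // d) for d in range(1, j + 1) if j % d == 0)


def marginal_brute(F, n, r, k):
    """E F(gcd(X_1..X_r, k)) by enumerating the n^r values of (X_1..X_r)."""
    total = sum(
        F(reduce(gcd, tup, k)) for tup in product(range(1, n + 1), repeat=r)
    )
    return Fraction(total, n ** r)


def marginal_formula(F, n, r, k):
    """Right hand side of (3.8): the sum runs over divisors j of k only."""
    total = sum(
        mu_convolve(F, j) * (n // j) ** r for j in range(1, k + 1) if k % j == 0
    )
    return Fraction(total, n ** r)
```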
3.1 Marginal probabilities and expectations

Two particularly relevant cases of the marginal formula are obtained by setting $F = \delta_1$ and $F = I$. They will appear quite often throughout this paper; specific notations are in order.

[Marginal probability] With $F = \delta_1$, we obtain, for $1 \le k \le n$,

\[ (3.9) \quad U_r^{(n)}(k) := \mathbf{P}\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)}, k) = 1\big) = \frac{1}{n^r} \sum_{j \mid k} \mu(j) \Big\lfloor \frac{n}{j} \Big\rfloor^{r} . \]

For $r = 1$, $U_1^{(n)}(k)$ is the proportion of numbers in $\{1, \ldots, n\}$ which are coprime with $k$, the familiar Legendre function.

[Marginal expectation] With $F = I$, we obtain, for $1 \le k \le n$,

\[ (3.10) \quad W_r^{(n)}(k) := \mathbf{E}\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)}, k)\big) = \frac{1}{n^r} \sum_{j \mid k} \varphi(j) \Big\lfloor \frac{n}{j} \Big\rfloor^{r} . \]

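3.1.1 Estimates

In this section we record some estimates for the marginal probabilities $U_r^{(n)}(k)$ and expectations $W_r^{(n)}(k)$ that we shall need later on. Both estimates come about from comparing their respective expressions (3.9) and (3.10) with the analogous expressions that one gets by removing the floor $\lfloor\ \rfloor$, and which are much more manageable as they do not contain $n$.

Lemma 3.4 (Estimates for marginal probabilities and expectations). For any integers $n, r \ge 1$ and any integer $k$, with $1 \le k \le n$, we have that

\[ (3.11) \quad \Big| \frac{\varphi_r(k)}{k^r} - U_r^{(n)}(k) \Big| \le r \, \frac{\tau(k)}{n} , \]

\[ (3.12) \quad 0 \le \sum_{j \mid k} \frac{\varphi(j)}{j^r} - W_r^{(n)}(k) = \frac{P_r(k)}{k^r} - W_r^{(n)}(k) \le \begin{cases} k/n , & \text{if } r = 1 , \\ r\, \tau(k)/n , & \text{if } r \ge 2 . \end{cases} \]

These, of course, are standard bounds. See, for instance, D. H. Lehmer (Lemma 4 in [25]) or Tóth (equation (7) in [33]) for the case $r = 1$ of (3.11).

Proof. We shall use that $x^r - \lfloor x \rfloor^r \le r x^{r-1}$, for any $x \ge 0$.

a) We may bound

\[ \Big| \sum_{j \mid k} \frac{\mu(j)}{j^r} - U_r^{(n)}(k) \Big| = \frac{1}{n^r} \Big| \sum_{j \mid k} \mu(j) \Big[ \Big(\frac{n}{j}\Big)^{r} - \Big\lfloor \frac{n}{j} \Big\rfloor^{r} \Big] \Big| \le \frac{r}{n} \sum_{j \mid k} \frac{1}{j^{r-1}} \le \frac{r}{n} \sum_{j \mid k} 1 = r \, \frac{\tau(k)}{n} . \]

b) The fact that $W_r^{(n)}(k) \le P_r(k)/k^r$ is immediate (see (2.9)). Finally,

\[ \frac{P_r(k)}{k^r} - W_r^{(n)}(k) = \frac{1}{n^r} \sum_{j \mid k} \varphi(j) \Big[ \Big(\frac{n}{j}\Big)^{r} - \Big\lfloor \frac{n}{j} \Big\rfloor^{r} \Big] \le \frac{r}{n} \sum_{j \mid k} \frac{\varphi(j)}{j^{r-1}} . \]

For $r = 1$, we use that $\sum_{j \mid k} \varphi(j) = k$, while, for $r \ge 2$, we use that $\sum_{j \mid k} \varphi(j)/j^{r-1} \le \tau(k)$. □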
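3.1.2 Asymptotic behavior of means

The average values of the marginal probabilities and expectations are, simply,

\[ \mu_r^{(n)} := \mathbf{E}_n\big(U_r^{(n)}\big) = \frac{1}{n} \sum_{k=1}^{n} U_r^{(n)}(k) = \mathbf{P}\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)}, X_{r+1}^{(n)}) = 1\big) , \]
\[ \nu_r^{(n)} := \mathbf{E}_n\big(W_r^{(n)}\big) = \frac{1}{n} \sum_{k=1}^{n} W_r^{(n)}(k) = \mathbf{E}\big(\gcd(X_1^{(n)}, \ldots, X_r^{(n)}, X_{r+1}^{(n)})\big) . \]

From equations (3.2), (3.4) and (3.3), we have:

Lemma 3.5. For each integer $r \ge 1$,

\[ \lim_{n \to \infty} \mu_r^{(n)} = \frac{1}{\zeta(r+1)} . \]

For each integer $r \ge 2$,

\[ \lim_{n \to \infty} \nu_r^{(n)} = \frac{\zeta(r)}{\zeta(r+1)} , \]

while, for $r = 1$,

\[ \nu_1^{(n)} \sim \frac{1}{\zeta(2)} \ln(n) , \quad\text{as } n \to \infty . \]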
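3.1.3 Asymptotic behavior of variances

We denote by $c_r^{(n)}$ the variance of the marginal probability $U_r^{(n)}$:

\[ c_r^{(n)} := \mathbf{V}_n\big(U_r^{(n)}\big) = \mathbf{E}_n\big((U_r^{(n)})^2\big) - \mathbf{E}_n\big(U_r^{(n)}\big)^2 = \frac{1}{n} \sum_{k=1}^{n} U_r^{(n)}(k)^2 - \Big(\frac{1}{n} \sum_{k=1}^{n} U_r^{(n)}(k)\Big)^2 . \]

Observe that we may interpret $c_r^{(n)}$ as the covariance

\[ (3.13) \quad c_r^{(n)} = \mathrm{cov}\big(\mathbf{1}_{\gcd(X_1^{(n)}, X_2^{(n)}, \ldots, X_{r+1}^{(n)}) = 1} ,\ \mathbf{1}_{\gcd(X_1^{(n)}, X_{r+2}^{(n)}, \ldots, X_{2r+1}^{(n)}) = 1}\big) , \]

where each of the two gcd's involves $r + 1$ among the $X_j^{(n)}$'s variables, sharing exactly one of them, $X_1^{(n)}$. This interpretation follows by conditioning on the value of the common variable $X_1^{(n)}$.

Appealing to the estimate of Lemma 3.4, we may compare second moments as follows

\[ \Big| \frac{1}{n} \sum_{k=1}^{n} U_r^{(n)}(k)^2 - \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{\varphi_r(k)}{k^r}\Big)^2 \Big| \le \frac{1}{n} \sum_{k=1}^{n} \Big| U_r^{(n)}(k) - \frac{\varphi_r(k)}{k^r} \Big| \Big( U_r^{(n)}(k) + \frac{\varphi_r(k)}{k^r} \Big) \le \frac{2r}{n^2} \sum_{k=1}^{n} \tau(k) \le \frac{2r\,(1 + \ln(n))}{n} , \]

where, besides, we have used that $U_r^{(n)}(k) \le 1$, that $\varphi_r(k) \le k^r$ and also that $\sum_{k=1}^{n} \tau(k) \le n(1 + \ln(n))$ (see for instance [17], Theorem 320).

From this, and recalling the definition (2.7) of Schur's constant $S_2^{(r)}$, we deduce that

\[ \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} U_r^{(n)}(k)^2 = S_2^{(r)} , \]

and, consequently, in conjunction with Lemma 3.5, and since $S_1^{(r)} = 1/\zeta(r+1)$,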
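Lemma 3.6. For any integer $r \ge 1$,

\[ \lim_{n \to \infty} c_r^{(n)} = \lim_{n \to \infty} \mathbf{V}_n\big(U_r^{(n)}\big) = S_2^{(r)} - \big(S_1^{(r)}\big)^2 . \]

We point out for later use that $\lim_{n \to \infty} c_r^{(n)} > 0$, by (2.8).

The analysis of the variance of the marginal expectation is a bit more involved. We introduce the notation $d_r^{(n)}$ for the variance of the marginal expectation $W_r^{(n)}$:

\[ d_r^{(n)} := \mathbf{V}_n\big(W_r^{(n)}\big) = \frac{1}{n} \sum_{k=1}^{n} W_r^{(n)}(k)^2 - \Big(\frac{1}{n} \sum_{k=1}^{n} W_r^{(n)}(k)\Big)^2 . \]

Observe that we may interpret $d_r^{(n)}$ as the covariance

\[ (3.14) \quad d_r^{(n)} = \mathrm{cov}\big(\gcd(X_1^{(n)}, X_2^{(n)}, \ldots, X_{r+1}^{(n)}) ,\ \gcd(X_1^{(n)}, X_{r+2}^{(n)}, \ldots, X_{2r+1}^{(n)})\big) ; \]

where, again, each of the two gcd's involves $r + 1$ among the $X_j^{(n)}$'s variables, sharing exactly one of them, $X_1^{(n)}$.

We compare second moments as follows:

\[ \Big| \frac{1}{n} \sum_{k=1}^{n} W_r^{(n)}(k)^2 - \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P_r(k)}{k^r}\Big)^2 \Big| = \frac{1}{n} \sum_{k=1}^{n} \Big( \frac{P_r(k)}{k^r} - W_r^{(n)}(k) \Big) \Big( \frac{P_r(k)}{k^r} + W_r^{(n)}(k) \Big) \le \frac{2}{n} \sum_{k=1}^{n} \tau(k) \Big( \frac{P_r(k)}{k^r} - W_r^{(n)}(k) \Big) , \]

where we have used that $W_r^{(n)}(k) \le P_r(k)/k^r \le \tau(k)$ (see (2.10) and (3.12)).

Now, we appeal to Lemma 3.4. For $r = 1$, we obtain that

\[ (3.15) \quad \Big| \frac{1}{n} \sum_{k=1}^{n} W_1^{(n)}(k)^2 - \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P(k)}{k}\Big)^2 \Big| \le \frac{2}{n^2} \sum_{k=1}^{n} k\, \tau(k) \le \frac{2}{n} \sum_{k=1}^{n} \tau(k) \le 2\,(1 + \ln(n)) , \]

where we have used that $k \le n$ and, once again, that $\sum_{k=1}^{n} \tau(k) \le n(1 + \ln(n))$. Notice that for $r = 1$ the bound obtained does not converge to 0.

For $r \ge 2$, we have that

\[ (3.16) \quad \Big| \frac{1}{n} \sum_{k=1}^{n} W_r^{(n)}(k)^2 - \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P_r(k)}{k^r}\Big)^2 \Big| \le \frac{2r}{n^2} \sum_{k=1}^{n} \tau(k)^2 . \]

This bound does converge to 0, as $n \to \infty$; this may be seen by recalling that $\tau(k) = O_{\delta}(k^{\delta})$, for any $\delta > 0$ (see [17], Theorem 315), or more precisely, by appealing to Ramanujan's asymptotic result that $\sum_{k=1}^{n} \tau(k)^2 \sim \frac{1}{\pi^2}\, n \ln(n)^3$, as $n \to \infty$ (see [17], second note on Chapter XVIII and the references therein). Incidentally, the sum $\sum_{k=1}^{n} k\, \tau(k)$ appearing in (3.15) behaves asymptotically as $\frac{1}{2} n^2 \ln(n)$, as $n \to \infty$, since $\sum_{k=1}^{\infty} k\, \tau(k)/k^z = \zeta(z-1)^2$ for $\Re(z) > 2$.

In any case, in what follows we just need that the bound in (3.15) is $o(\ln(n)^3)$ and that the bound in (3.16) is $o(1)$.

We keep splitting the discussion into the case $r = 1$ and the case $r \ge 2$. We start with the latter.

If $r \ge 2$, the bound in equation (3.16) converges to 0, as $n \to \infty$. Moreover, in this case, Lemma 2.9 gives

\[ \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{P_r(k)}{k^r}\Big)^2 = M(r) . \]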
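We conclude that also

\[ \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} W_r^{(n)}(k)^2 = M(r) , \]

and, therefore, that

\[ \lim_{n \to \infty} \mathbf{V}_n\big(W_r^{(n)}\big) = M(r) - \Big(\frac{\zeta(r)}{\zeta(r+1)}\Big)^2 . \]

Since

\[ \frac{\zeta(r)}{\zeta(r+1)} = \sum_{j=1}^{\infty} \frac{\varphi(j)}{j^{r+1}} , \]

we may, finally, write this limiting variance in the following appealing form:

\[ \lim_{n \to \infty} \mathbf{V}_n\big(W_r^{(n)}\big) = \sum_{1 \le i,j < \infty} \frac{\varphi(i)\varphi(j)}{i^{r+1} j^{r+1}} \big(\gcd(i,j) - 1\big) . \]

Now we turn to the case $r = 1$. Notice that the bound in equation (3.15) is of the order $\ln(n)$, while, by Tóth's Theorem, see Remark 2.10, the average $\frac{1}{n} \sum_{k=1}^{n} \big(\frac{P(k)}{k}\big)^2$ is of order $\ln(n)^3$. Therefore,

\[ \frac{1}{n} \sum_{k=1}^{n} W_1^{(n)}(k)^2 \sim A_{\text{Tóth}} \ln(n)^3 , \quad\text{as } n \to \infty . \]

Finally, since

\[ \frac{1}{n} \sum_{k=1}^{n} W_1^{(n)}(k) = \nu_1^{(n)} \sim \frac{1}{\zeta(2)} \ln(n) , \quad\text{as } n \to \infty , \]

we conclude that

\[ \mathbf{V}_n\big(W_1^{(n)}\big) \sim A_{\text{Tóth}} \ln(n)^3 . \]

We have proved:

Lemma 3.7. For any integer $r \ge 2$,

\[ \lim_{n \to \infty} d_r^{(n)} = \lim_{n \to \infty} \mathbf{V}_n\big(W_r^{(n)}\big) = \sum_{1 \le i,j < \infty} \frac{\varphi(i)\varphi(j)}{i^{r+1} j^{r+1}} \big(\gcd(i,j) - 1\big) , \]

while, for $r = 1$,

\[ d_1^{(n)} = \mathbf{V}_n\big(W_1^{(n)}\big) \sim A_{\text{Tóth}} \ln(n)^3 , \quad\text{as } n \to \infty . \]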



4 Statistics of gcd of pairs

Equipped with the estimates that we have gathered in the last two sections, in particular, Lemmas 3.5, 3.6 and 3.7, we are now ready to approach the statistics of gcd of pairs of large samples of integers. We shall focus on the limiting behavior, as the sample size $m$ tends to infinity, of the distribution of the following three basic statistics:

\[ C_m^{(n)} = \sum_{1 \le i < j \le m} \mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)}) = 1} , \]

that counts the number of coprime pairs,

\[ Z_m^{(n)} = \sum_{1 \le i < j \le m} \gcd\big(X_i^{(n)}, X_j^{(n)}\big) , \]

which sums the gcd of pairs of the sample, and

\[ M_m^{(n)} = \max_{1 \le i < j \le m} \big\{ \gcd\big(X_i^{(n)}, X_j^{(n)}\big) \big\} , \]

which gives the maximum gcd of the pairs of the sample.

We should remark that the asymptotic results of Section 3.1 that pertain to this section on pairs are those with $r = 1$ (and not $r = 2$).

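4.1 Asymptotic distribution of the number of coprime couples

We start with the counter

\[ C_m^{(n)} = \sum_{1 \le i < j \le m} \mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)}) = 1} , \]

a sum of $\binom{m}{2}$ random variables, identically distributed but not independent.

The expectation of $C_m^{(n)}$ is given by

\[ (4.1) \quad \mathbf{E}\big(C_m^{(n)}\big) = \binom{m}{2} \mathbf{E}\big(\mathbf{1}_{\gcd(X_1^{(n)}, X_2^{(n)}) = 1}\big) = \binom{m}{2} \mu_1^{(n)} . \]

Recall (Lemma 3.5) that, as $n \to \infty$,

\[ \mathbf{E}\big(\mathbf{1}_{\gcd(X_1^{(n)}, X_2^{(n)}) = 1}\big) = \mu_1^{(n)} \longrightarrow \frac{1}{\zeta(2)} . \]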

Notice also that 

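For the variance of $C_m^{(n)}$ we have:

Lemma 4.1. The variance of the variable $C_m^{(n)}$ is given by

\[ (4.2) \quad \mathbf{V}\big(C_m^{(n)}\big) = \binom{m}{2} \mu_1^{(n)} \big(1 - \mu_1^{(n)}\big) + m(m-1)(m-2)\, c_1^{(n)} . \]

Recall that

\[ \lim_{n \to \infty} c_1^{(n)} = \lim_{n \to \infty} \mathbf{V}_n\big(U_1^{(n)}\big) = S_2^{(1)} - \big(S_1^{(1)}\big)^2 > 0 \]

(see Lemma 3.6). Therefore, since $c_1^{(n)} > 0$ for each $n \ge 2$, we have that

\[ (4.3) \quad C_1 := \inf_{n \ge 2} c_1^{(n)} > 0 . \]

Proof. The variable $C_m^{(n)}$ is a sum of $\binom{m}{2}$ terms, so there will appear $\binom{m}{2}^2$ terms in the expansion of its variance in terms of covariances of pairs of summands:

• $\binom{m}{2}$ individual variances $\mathbf{V}\big(\mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)}) = 1}\big) = \mu_1^{(n)} \big(1 - \mu_1^{(n)}\big)$;

• $m(m-1)(m-2)$ covariances of the type

\[ \mathrm{cov}\big(\mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)}) = 1} ,\ \mathbf{1}_{\gcd(X_i^{(n)}, X_k^{(n)}) = 1}\big) = \mathbf{V}_n\big(U_1^{(n)}\big) = c_1^{(n)} \]

(with exactly one $X_i^{(n)}$ in common);

• plus $\binom{m}{2}\binom{m-2}{2}$ covariances of the type

\[ \mathrm{cov}\big(\mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)}) = 1} ,\ \mathbf{1}_{\gcd(X_k^{(n)}, X_l^{(n)}) = 1}\big) \]

(with no $X_j^{(n)}$ in common). All these covariances are 0, because of the independence of the $X_j^{(n)}$'s.

Equation (4.2) follows. □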
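Formula (4.2) is exact for every $n$ and $m$, so it can be tested against a full enumeration of the $n^m$ possible samples when both are small. The sketch below does this with exact rational arithmetic; the function names are hypothetical and the enumeration is feasible only for tiny cases.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb, gcd


def coprime_pair_count(sample):
    """Number of coprime pairs in the sample."""
    return sum(1 for a, b in combinations(sample, 2) if gcd(a, b) == 1)


def exact_variance_of_count(n, m):
    """Variance of the coprime-pair counter, enumerating all n^m samples."""
    values = [
        coprime_pair_count(s) for s in product(range(1, n + 1), repeat=m)
    ]
    mean = Fraction(sum(values), n ** m)
    return Fraction(sum(v * v for v in values), n ** m) - mean ** 2


def variance_from_lemma(n, m):
    """Right hand side of (4.2), with mu_1 and c_1 computed from U_1(k)."""
    U = [
        Fraction(sum(1 for x in range(1, n + 1) if gcd(x, k) == 1), n)
        for k in range(1, n + 1)
    ]
    mu1 = sum(U) / n
    c1 = sum(u * u for u in U) / n - mu1 ** 2
    return comb(m, 2) * mu1 * (1 - mu1) + m * (m - 1) * (m - 2) * c1
```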

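Consider now the collection of $\binom{m}{2}$ variables $\mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)}) = 1}$, with $i < j$, as the vertices of a graph $\Gamma_m^{(n)}$; there is an edge joining a pair of vertices $\mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)}) = 1}$ and $\mathbf{1}_{\gcd(X_k^{(n)}, X_l^{(n)}) = 1}$ when the sets of indexes $\{i,j\}$ and $\{k,l\}$ have exactly one index in common. Thus $\Gamma_m^{(n)}$ is the dependency graph of the variables $\{\mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)}) = 1}\}_{1 \le i < j \le m}$.

We will now apply an asymptotic normality result of S. Janson [16], see also P. Baldi and Y. Rinott ([1], particularly Proposition 5), concerning sums of (locally) dependent variables.

Theorem 4.2 (Janson, Theorem 2 in [16]). Suppose that, for each integer $t \ge 1$, we have a family $\{Y_{t,1}, \ldots, Y_{t,N_t}\}$ of bounded random variables, with almost sure common bound $|Y_{t,i}| \le A_t$. Let $M_t$ be the maximal degree of the dependency graph $\Gamma_t$ of the family $\{Y_{t,1}, \ldots, Y_{t,N_t}\}$. Denote

\[ S_t = \sum_{i=1}^{N_t} Y_{t,i} , \]

and let $\sigma_t^2 = \mathbf{V}(S_t)$. If there exists an integer $h \ge 3$ such that

\[ \Big(\frac{N_t}{M_t}\Big)^{1/h} \frac{M_t A_t}{\sigma_t} \longrightarrow 0 , \quad\text{as } t \to \infty , \]

then

\[ \frac{S_t - \mathbf{E}(S_t)}{\sigma_t} \xrightarrow{\ d\ } \mathcal{N} , \quad\text{as } t \to \infty . \]

For the families $\{\mathbf{1}_{\gcd(X_i^{(n)}, X_j^{(n)}) = 1}\}_{1 \le i < j \le m}$ with dependency graphs $\Gamma_m^{(n)}$, the corresponding parameters of Janson's Theorem are: number of variables $N = \binom{m}{2}$, maximal degree $M = 2(m-2)$, uniform bound on the variables $A = 1$, and

\[ \sigma^2 = \binom{m}{2} \mu_1^{(n)} \big(1 - \mu_1^{(n)}\big) + m(m-1)(m-2)\, c_1^{(n)} . \]

Only in this last parameter the size $n$ of the sample space intervenes, but actually, we may bound

\[ \sigma^2 \ge m(m-1)(m-2)\, C_1 , \]

where $C_1$ is given in (4.3), as long as $n \ge 2$. Now,

\[ \Big(\frac{N}{M}\Big)^{1/h} \frac{M A}{\sigma} \le \Big(\frac{\binom{m}{2}}{2(m-2)}\Big)^{1/h} \frac{2(m-2)}{\sqrt{m(m-1)(m-2)\, C_1}} \longrightarrow 0 , \quad\text{as } m \to \infty , \]

as long as the integer $h \ge 3$. Summarizing, we have proved: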
as long as the integer ft, > 3. Summarizing, we have proved: 
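Theorem 4.3. The counter of coprime pairs $C_m^{(n)}$ is asymptotically normal:

$$\frac{C_m^{(n)}-E(C_m^{(n)})}{\sqrt{V(C_m^{(n)})}}\ \xrightarrow{\ d\ }\ \mathcal N,\quad\text{as } m\to\infty,$$

for any fixed $n\ge 2$. More generally,

$$\frac{C_m^{(n_m)}-E(C_m^{(n_m)})}{\sqrt{V(C_m^{(n_m)})}}\ \xrightarrow{\ d\ }\ \mathcal N,\quad\text{as } m\to\infty,$$

for any sequence $n_m$, as long as $n_m\ge 2$ for each $m$.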
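The decay of the quantity controlled by Janson's Theorem in the argument preceding Theorem 4.3 can be checked numerically. The following Python sketch evaluates the upper bound $\big(\binom{m}{2}/(2(m-2))\big)^{1/h}\,2(m-2)/\sqrt{m(m-1)(m-2)C_1}$. The function name `janson_bound` and the value `c1 = 1.0` standing in for the constant $C_1$ of (4.3) are assumptions made only for illustration; the true value of $C_1$ only rescales the output.

```python
from math import comb, sqrt


def janson_bound(m: int, h: int = 3, c1: float = 1.0) -> float:
    """Upper bound for (N/M)^(1/h) * M * A / sigma in the coprime-pairs case.

    N = C(m, 2) variables, maximal degree M = 2(m - 2), bound A = 1, and
    sigma^2 >= m(m - 1)(m - 2) * c1, where c1 is a placeholder for C_1.
    """
    if m < 3:
        raise ValueError("need m >= 3 so that the maximal degree is positive")
    n_vars = comb(m, 2)
    max_degree = 2 * (m - 2)
    sigma_lower = sqrt(m * (m - 1) * (m - 2) * c1)
    return (n_vars / max_degree) ** (1.0 / h) * max_degree / sigma_lower


if __name__ == "__main__":
    for m in (10**2, 10**3, 10**4, 10**5, 10**6):
        print(m, janson_bound(m, h=3))
```

For fixed $h$ the printed values decrease roughly like $m^{1/h-1/2}$, which is why any integer $h\ge 3$ suffices in this case.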



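It is perhaps more natural to consider $\overline{C}_m^{(n)}=\binom{m}{2}^{-1}C_m^{(n)}$, the average number of coprime pairs in the sample of size $m$. For fixed $n\ge 2$, since $V(C_m^{(n)})\sim m^3c_2^{(n)}$, we have, as $m\to\infty$,

$$\frac{\sqrt{m}}{2\sqrt{c_2^{(n)}}}\Big(\overline{C}_m^{(n)}-E\big(\overline{C}_m^{(n)}\big)\Big)\ \xrightarrow{\ d\ }\ \mathcal N .$$

So, for large $m$ (the sample size) and $n$ (the size of the sample space), $\overline{C}_m^{(n)}$ is approximately normal, centered near $1/\zeta(2)$ and with standard deviation of order $1/\sqrt{m}$.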
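A small simulation makes the behaviour of the proportion of coprime pairs concrete. The Python sketch below draws a sample of size $m$ uniformly from $\{1,\dots,n\}$, counts the coprime pairs, and divides by $\binom{m}{2}$; the result can be compared with $1/\zeta(2)=6/\pi^2$. The function names, the seed and the parameter values are illustrative assumptions, not taken from the paper.

```python
import random
from itertools import combinations
from math import gcd, pi


def coprime_pair_count(sample):
    """Number of pairs i < j of the sample whose gcd equals 1."""
    return sum(1 for a, b in combinations(sample, 2) if gcd(a, b) == 1)


def average_coprime_pairs(m, n, rng):
    """Proportion of coprime pairs in a sample of size m drawn uniformly from {1..n}."""
    sample = [rng.randint(1, n) for _ in range(m)]
    return coprime_pair_count(sample) / (m * (m - 1) // 2)


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed, only for reproducibility
    print(average_coprime_pairs(m=500, n=10**6, rng=rng), 6 / pi**2)
```

Repeating the experiment with independent samples and plotting a histogram of the outputs would give an empirical picture of the asymptotic normality.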




Remark 4.4. Notice that, for $n$ fixed, the variance of $C_m^{(n)}$ is of the order $\binom{m}{2}^{3/2}$. This is to be compared with the variance of a sum of $\binom{m}{2}$ identically distributed and pairwise independent Bernoulli variables, which is of the order $\binom{m}{2}$, and with the variance of a sum of $\binom{m}{2}$ identically distributed Bernoulli variables with constant positive correlation among them, which is of the order $\binom{m}{2}^{2}$.

Remark 4.5. As we have mentioned in the introduction, Theorem 4.3 could be derived in the $n$ constant case from the classical results of W. Hoeffding on normal approximation of U-statistics, [19], see also [31]. The approach through dependency graphs appears to be more flexible, particularly when $n$ is allowed to vary. In any case, asymptotic normality of standard U-statistics could be derived from the dependency graph approach, see Application C in [1].

Remark 4.6. There are good estimates for the rate of convergence to normality for sums of locally dependent variables, for instance, [2], which could be applied to the variables $C_m^{(n)}$.

4.2 Sums of greatest common divisor of pairs 

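We now deal with the random variable

$$Z_m^{(n)}=\sum_{1\le i<j\le m}\gcd\big(X_i^{(n)},X_j^{(n)}\big).$$

The mean of $Z_m^{(n)}$ is given by

$$E\big(Z_m^{(n)}\big)=\binom{m}{2}\,\nu^{(n)},$$

where

$$\nu^{(n)}=E\big(\gcd(X_1^{(n)},X_2^{(n)})\big)\sim\frac{1}{\zeta(2)}\ln(n),\quad\text{as } n\to\infty .$$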
Recall, see (3.7), that

$$E\big(\gcd(X_1^{(n)},X_2^{(n)})^2\big)\sim\frac{n}{3}\Big(\frac{2\zeta(2)}{\zeta(3)}-1\Big),$$

and consequently, that, also,

$$V\big(\gcd(X_1^{(n)},X_2^{(n)})\big)\sim\frac{n}{3}\Big(\frac{2\zeta(2)}{\zeta(3)}-1\Big),$$

as $n\to\infty$.

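For the variance of $Z_m^{(n)}$, we have, with the same argument as in Lemma 4.1:

Lemma 4.7. The variance of the variable $Z_m^{(n)}$ is given by

$$(4.5)\qquad V\big(Z_m^{(n)}\big)=\binom{m}{2}\,V\big(\gcd(X_1^{(n)},X_2^{(n)})\big)+m(m-1)(m-2)\,d_2^{(n)} .$$

This follows since

$$\operatorname{cov}\big(\gcd(X_1^{(n)},X_2^{(n)}),\ \gcd(X_2^{(n)},X_3^{(n)})\big)=d_2^{(n)}$$

(see (3.14)). Recall from Lemma 3.7 that

$$d_2^{(n)}\sim A_{\mathrm{Toth}}\,\ln(n)^3,\quad\text{as } n\to\infty .$$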
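The variance formula (4.5) is an exact identity for every $m$ and $n$, so it can be verified by brute force on very small cases. The Python sketch below computes both sides with exact rational arithmetic: the right-hand side from $V(\gcd(X_1,X_2))$ and $\operatorname{cov}(\gcd(X_1,X_2),\gcd(X_2,X_3))$, and the left-hand side by enumerating all $n^m$ samples. All function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb, gcd


def exact_moments(n: int):
    """Exact V(gcd(X1, X2)) and cov(gcd(X1, X2), gcd(X2, X3)), X's iid uniform on {1..n}."""
    values = range(1, n + 1)
    mean = Fraction(sum(gcd(a, b) for a, b in product(values, repeat=2)), n**2)
    second = Fraction(sum(gcd(a, b) ** 2 for a, b in product(values, repeat=2)), n**2)
    mixed = Fraction(
        sum(gcd(a, b) * gcd(b, c) for a, b, c in product(values, repeat=3)), n**3
    )
    return second - mean**2, mixed - mean**2


def variance_by_formula(m: int, n: int) -> Fraction:
    """Right-hand side of (4.5)."""
    var_gcd, cov_shared = exact_moments(n)
    return comb(m, 2) * var_gcd + m * (m - 1) * (m - 2) * cov_shared


def variance_by_enumeration(m: int, n: int) -> Fraction:
    """V(Z) computed directly over all n**m equally likely samples."""
    total = n**m
    s1 = s2 = 0
    for sample in product(range(1, n + 1), repeat=m):
        z = sum(gcd(a, b) for a, b in combinations(sample, 2))
        s1 += z
        s2 += z * z
    return Fraction(s2, total) - Fraction(s1, total) ** 2


if __name__ == "__main__":
    print(variance_by_formula(4, 3), variance_by_enumeration(4, 3))
```

The enumeration is only feasible for tiny values of $m$ and $n$; its purpose is to confirm the count $m(m-1)(m-2)$ of ordered pairs of pairs sharing exactly one index.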
With all this, we can now prove: 
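Theorem 4.8. The sum of gcd of pairs, $Z_m^{(n)}$, satisfies

$$\frac{Z_m^{(n)}-E(Z_m^{(n)})}{\sqrt{V(Z_m^{(n)})}}\ \xrightarrow{\ d\ }\ \mathcal N,\quad\text{as } m\to\infty,$$

for any fixed $n\ge 2$. More generally,

$$\frac{Z_m^{(n_m)}-E(Z_m^{(n_m)})}{\sqrt{V(Z_m^{(n_m)})}}\ \xrightarrow{\ d\ }\ \mathcal N,\quad\text{as } m\to\infty,$$

for any sequence $n_m$ as long as $n_m\ge 2$ for each $m\ge 1$, and $n_m=O(m^{\beta})$ as $m\to\infty$, for some $\beta<\frac12$.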

Proof. We follow the argument of Theorem 4.3, the case of sums of indicators. The (dependency) graph $\Gamma_m^{(n)}$ is the same except that the vertices are now labeled by the variables $\gcd(X_i^{(n)},X_j^{(n)})$. The parameters pertaining to Janson's Theorem are now: number of vertices $N=\binom{m}{2}$, maximal degree $M=2(m-2)$, bound on the variables $A=n=n_m$, and

$$\sigma^2=\binom{m}{2}\,V\big(\gcd(X_1^{(n)},X_2^{(n)})\big)+m(m-1)(m-2)\,d_2^{(n)}\ \ge\ m(m-1)(m-2)\,d_2^{(n)} .$$

Finally, for $h$ an integer so large that $\beta+1/h<\frac12$,

$$\Big(\frac{N}{M}\Big)^{1/h}\frac{MA}{\sigma}\ \le\ \Big(\frac{\binom{m}{2}}{2(m-2)}\Big)^{1/h}\frac{2(m-2)\,n_m}{\sqrt{m(m-1)(m-2)\,d_2^{(n)}}}\ \longrightarrow\ 0,\quad\text{as } m\to\infty,$$

since $n_m=O(m^{\beta})$, and since $d_2^{(n)}\sim A_{\mathrm{Toth}}\ln(n)^3$ (see Lemma 3.7). □



Remark 4.9. It would be interesting to determine whether a restriction on the rate of growth of the sample space size like $n_m\le m^{\beta}$, with $\beta<\frac12$, which we have imposed, is necessary for the asymptotic normality of $Z_m^{(n)}$, and if that is so, what is the optimal rate.
4.3 Extreme statistics of gcd of pairs

We now turn our attention to the random variable which registers the maximum of the greatest common divisors of pairs of the sample:

$$\mathcal M_m^{(n)}=\max_{1\le i<j\le m}\gcd\big(X_i^{(n)},X_j^{(n)}\big).$$

In [9], Darling and Pyle studied the asymptotic behavior of the distribution of this variable, obtained some interesting results and asked whether its normalized version $\mathcal M_m^{(n)}/\binom{m}{2}$ had a limit in distribution as $m\to\infty$ or not. The following theorem provides an answer.

Theorem 4.10. Let $m^{\beta}\le n\le e^{m^{\gamma}}$, for some $\beta>2$ and $\gamma<\frac13$. Then, for any $t>0$,

$$\lim_{m\to\infty}P\Big(\mathcal M_m^{(n)}\Big/\binom{m}{2}\le t\Big)=\exp\Big(-\frac{1}{\zeta(2)\,t}\Big).$$

In other terms, $\mathcal M_m^{(n)}/\binom{m}{2}$ tends, in distribution, as $m\to\infty$, to the Fréchet distribution with shape parameter 1 and scale parameter $1/\zeta(2)$.

Observe that this convergence result requires that the size of the sampling space $n$ tends to infinity along with $m$, the sample size, in contrast to the asymptotic normality results for the variables $C_m^{(n)}$ and $Z_m^{(n)}$, where the size of the sampling space $n$ played a relatively secondary role (see Sections 4.1 and 4.2). The Fréchet distribution is one of the standard distributions used in Extreme Value Theory.
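The limit law of Theorem 4.10 can be explored by simulation. The Python sketch below computes the maximum gcd over all pairs of a sample, normalizes by $\binom{m}{2}$, and compares the empirical distribution function with the Fréchet limit $\exp(-1/(\zeta(2)t))$. The function names, seed and parameter values are illustrative assumptions; in particular, the small values of $m$ used for speed do not satisfy the upper restriction $n\le e^{m^{\gamma}}$ of the theorem, so the comparison is only heuristic.

```python
import random
from itertools import combinations
from math import comb, exp, gcd, pi

ZETA_2 = pi**2 / 6


def max_pair_gcd(sample):
    """Maximum of gcd(a, b) over all pairs of the sample."""
    return max(gcd(a, b) for a, b in combinations(sample, 2))


def frechet_cdf(t: float) -> float:
    """Limit distribution function exp(-1 / (zeta(2) t)) for t > 0, and 0 otherwise."""
    return exp(-1.0 / (ZETA_2 * t)) if t > 0 else 0.0


def empirical_cdf(t, m, n, trials, rng):
    """Fraction of simulated samples whose maximum gcd is at most t * C(m, 2)."""
    threshold = t * comb(m, 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.randint(1, n) for _ in range(m)]
        if max_pair_gcd(sample) <= threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only for reproducibility
    for t in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(t, empirical_cdf(t, m=40, n=10**6, trials=500, rng=rng), frechet_cdf(t))
```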

Theorem 4.10 is a direct, and standard, consequence of the following result about Poisson
convergence:

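Theorem 4.11. Let $n$ be as in Theorem 4.10. Let $t>0$ and consider the random variable

$$N_m^{(n)}(t)=\#\Big\{1\le i<j\le m:\ \gcd\big(X_i^{(n)},X_j^{(n)}\big)>t\binom{m}{2}\Big\}.$$

Then, for each fixed $t>0$, the sequence $\{N_m^{(n)}(t)\}_m$ converges in distribution to a Poisson variable of parameter $\lambda=\frac{1}{t\,\zeta(2)}$:

$$N_m^{(n)}(t)\ \xrightarrow{\ d\ }\ \mathrm{Poisson}\Big(\frac{1}{t\,\zeta(2)}\Big),\quad\text{as } m\to\infty .$$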

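Proof of Theorem 4.10. Just observe that, according to Theorem 4.11,

$$P\Big(\mathcal M_m^{(n)}\Big/\binom{m}{2}>t\Big)=P\Big(\max_{1\le i<j\le m}\gcd\big(X_i^{(n)},X_j^{(n)}\big)>t\binom{m}{2}\Big)=P\big(N_m^{(n)}(t)>0\big)\ \longrightarrow\ 1-\exp\Big(-\frac{1}{\zeta(2)\,t}\Big),$$

as $m\to\infty$. □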



In the proof of Theorem 4.11, we will use results of Silverman and Brown (see Theorem A
in [32]) and of Brown and Silverman (see Theorem A in [3]) about Poisson convergence of
U-statistics, which, for the pairwise case, we may write as follows:






Theorem 4.12 (Brown-Silverman). Let $Y_1,Y_2,\dots,Y_M$ be iid random variables taking values on some space $S$. Let $g(x,y)$ be a symmetric function defined on $S\times S$ and taking values 0 and 1. Denote by $T$ the counter

$$T=\sum_{1\le i<j\le M}g(Y_i,Y_j).$$

Let $\lambda=E(T)$ and

$$\rho=M^{3}\operatorname{cov}\big(g(Y_1,Y_2),\,g(Y_2,Y_3)\big).$$

Then

$$\big|P(T=k)-P(\mathrm{Poisson}(\lambda)=k)\big|\le C\,(\,\cdots\,),\quad\text{for each integer } k\ge 0,$$

where $C$ is some absolute constant.



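Proof of Theorem 4.11. a) We shall require a simple estimate for the distribution function of the greatest common divisor of a random pair. The mass function of the gcd of a pair satisfies (see, for instance, [12]), for $1\le j\le n$,

$$\Big|P\big(\gcd(X_1^{(n)},X_2^{(n)})=j\big)-\frac{1}{\zeta(2)}\frac{1}{j^2}\Big|\ \le\ 4\,\frac{1+\ln(n/j)}{n\,j}\ \le\ 4\,\frac{1+\ln(n)}{n}\,\frac{1}{j}.$$

Summing over $j$ from $k+1$ to $n$, we deduce that, for $0\le k\le n$,

$$(4.6)\qquad \Big|P\big(\gcd(X_1^{(n)},X_2^{(n)})>k\big)-\frac{1}{\zeta(2)}\sum_{j=k+1}^{n}\frac{1}{j^2}\Big|\ \le\ 4\,\frac{(1+\ln(n))^2}{n}.$$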



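b) We will also need a convenient estimate of $E\big(\gcd(X_1^{(n)},X_2^{(n)})\cdot\gcd(X_2^{(n)},X_3^{(n)})\big)$, but recall (see Lemma 3.7) that

$$\operatorname{cov}\big(\gcd(X_1^{(n)},X_2^{(n)}),\ \gcd(X_2^{(n)},X_3^{(n)})\big)=d_2^{(n)}\sim A_{\mathrm{Toth}}\ln(n)^3,$$

and, consequently,

$$(4.7)\qquad E\big(\gcd(X_1^{(n)},X_2^{(n)})\cdot\gcd(X_2^{(n)},X_3^{(n)})\big)=O\big(\ln(n)^3\big).$$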



Consider a sequence $n=n_m$ satisfying the conditions $m^{\beta}\le n_m\le e^{m^{\gamma}}$, for some $\beta>2$ and $\gamma<\frac13$. Fix $t>0$. To apply Theorem 4.12, we define the function

$$g_m(x,y)=1_{\gcd(x,y)>t\binom{m}{2}},$$

for $1\le x,y\le n$, and the random variable

$$T_m=\sum_{1\le i<j\le m}g_m\big(X_i^{(n)},X_j^{(n)}\big),$$

which counts the number of random pairs with gcd bigger than $t\binom{m}{2}$. Let us estimate the corresponding parameters $\lambda_m$ and $\rho_m$. First,

$$\lambda_m=E(T_m)=\binom{m}{2}\,P\Big(\gcd\big(X_1^{(n)},X_2^{(n)}\big)>t\binom{m}{2}\Big).$$
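We have that $\lim_{m\to\infty}\lambda_m=\frac{1}{t\,\zeta(2)}$. To verify this, let $K=\lfloor t\binom{m}{2}\rfloor$, bound, using the estimate (4.6):

$$\lambda_m\ \le\ \binom{m}{2}\Big(\frac{1}{\zeta(2)}\sum_{j=K+1}^{n}\frac{1}{j^2}+4\,\frac{(1+\ln(n))^2}{n}\Big)\ \le\ \binom{m}{2}\Big(\frac{1}{\zeta(2)}\frac{1}{K}+4\,\frac{(1+\ln(n))^2}{n}\Big),$$

and deduce, since $n\ge m^{\beta}$, with $\beta>2$, that

$$\limsup_{m\to\infty}\lambda_m\le\frac{1}{t\,\zeta(2)}.$$

Using $K=\lceil t\binom{m}{2}\rceil$, one gets analogously that $\liminf_{m\to\infty}\lambda_m\ge\frac{1}{t\,\zeta(2)}$.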
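Next, using estimate (4.7) and Markov's inequality, we may bound

$$\rho_m\ \le\ m^{3}\,\frac{1}{t^2\binom{m}{2}^{2}}\,E\big(\gcd(X_1^{(n)},X_2^{(n)})\cdot\gcd(X_2^{(n)},X_3^{(n)})\big)=\frac{1}{m}\,O\big(\ln(n)^3\big).$$

Reverting to the notation of the statement of the theorem, and on account of Theorem 4.12 of Brown and Silverman, this estimate of $\rho_m$, together with $\ln(n)^3\le m^{3\gamma}$ (since $n=n_m\le e^{m^{\gamma}}$), bounds $\big|P(N_m^{(n)}(t)=k)-P(\mathrm{Poisson}(\lambda_m)=k)\big|$ by a quantity which, since $\gamma<1/3$, tends to 0 as $m\to\infty$. This gives that

$$\lim_{m\to\infty}P\big(N_m^{(n)}(t)=k\big)=P\Big(\mathrm{Poisson}\Big(\frac{1}{t\,\zeta(2)}\Big)=k\Big)$$

for any integer $k\ge 0$, as desired. □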
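The Poisson parameter in Theorem 4.11 can be computed exactly for given $m$, $n$ and $t$, which gives a feeling for how fast $\binom{m}{2}P\big(\gcd(X_1,X_2)>t\binom{m}{2}\big)$ approaches $1/(t\zeta(2))$. The Python sketch below uses the elementary fact that $P(\gcd(X_1,X_2)=j)$ equals the number of coprime pairs in $\{1,\dots,\lfloor n/j\rfloor\}^2$ divided by $n^2$, and that this number is $2\sum_{i\le N}\varphi(i)-1$ with $N=\lfloor n/j\rfloor$. The function names and the parameter values are illustrative assumptions.

```python
from math import comb, pi


def totient_prefix_sums(limit: int):
    """prefix[N] = phi(1) + ... + phi(N), computed with a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime: no smaller prime has modified phi[p]
            for k in range(p, limit + 1, p):
                phi[k] -= phi[k] // p
    prefix = [0] * (limit + 1)
    for k in range(1, limit + 1):
        prefix[k] = prefix[k - 1] + phi[k]
    return prefix


def poisson_parameter(m: int, n: int, t: float) -> float:
    """Exact C(m,2) * P(gcd(X1, X2) > t * C(m,2)) for X's iid uniform on {1..n}."""
    pairs = comb(m, 2)
    k = int(t * pairs)  # floor of t * C(m, 2), valid because t > 0
    if k >= n:
        return 0.0
    prefix = totient_prefix_sums(n // (k + 1))
    # Pairs (a, b) with gcd(a, b) = j correspond to coprime pairs up to floor(n/j).
    favourable = sum(2 * prefix[n // j] - 1 for j in range(k + 1, n + 1))
    return pairs * favourable / n**2


if __name__ == "__main__":
    for m in (10, 20, 40):
        print(m, poisson_parameter(m, 10**5, 1.0), 6 / pi**2)
```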

Remark 4.13. About the lower restriction on $n_m$ in Theorem 4.11 there is not much to say, since just the statement of convergence requires that $n_m/\binom{m}{2}\to+\infty$, but it would be nice to know what is the upper restriction required, if any.

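From Theorem 4.10, we deduce as a corollary an asymptotic concentration result of Darling and Pyle, [9], Theorem 1:

Corollary 4.14. If $n=n_m$ satisfies $n\ge m^{\beta}$, for some $\beta>2$ and, also, $n\le e^{m^{\gamma}}$, for some $\gamma<\frac13$, then, for any sequence $\delta_m>0$ with $\lim_{m\to\infty}\delta_m=0$, we have that

$$\lim_{m\to\infty}P\Big(m^2\delta_m\ \le\ \max_{1\le i<j\le m}\gcd\big(X_i^{(n)},X_j^{(n)}\big)\ \le\ \frac{m^2}{\delta_m}\Big)=1 .$$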

Remark 4.15. Notice that Darling and Pyle prove the above corollary for the sequence $n_m=e^{\alpha m}$, where $\alpha$ is any positive number; a sequence which is beyond the range of our Corollary 4.14. It would be interesting to determine the optimal range of rates of growth of $n_m$ for the validity both of Theorem 4.10 and of Corollary 4.14.



5 U-statistics for greatest common divisors of r-tuples

We shall assume throughout this section that $r\ge 3$. We consider now U-statistics summing over the collection of subsets of size $r$ of the random sample of length $m$.
5.1 Number of relatively prime r-tuples

Let us start with the variable

$$C_{m,r}^{(n)}=\sum_{1\le i_1<\dots<i_r\le m}1_{\gcd(X_{i_1}^{(n)},\dots,X_{i_r}^{(n)})=1},$$

the sum of $\binom{m}{r}$ terms counting the number of coprime r-tuples in a random sample of size $m$ drawn uniformly from $\{1,\dots,n\}$.
We have:

Theorem 5.1. For fixed $r\ge 3$ and for any sequence $n_m\ge 2$,

$$\frac{C_{m,r}^{(n_m)}-E\big(C_{m,r}^{(n_m)}\big)}{\sqrt{V\big(C_{m,r}^{(n_m)}\big)}}\ \xrightarrow{\ d\ }\ \mathcal N,\quad\text{as } m\to\infty .$$



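The argument to prove Theorem 5.1 follows the same steps as the case of pairs, so we shall only indicate some specific differences. The mean of $C_{m,r}^{(n)}$ is given by

$$E\big(C_{m,r}^{(n)}\big)=\binom{m}{r}\,P\big(\gcd(X_1^{(n)},\dots,X_r^{(n)})=1\big);$$

recall, from Lemma 3.5, that $\lim_{n\to\infty}P\big(\gcd(X_1^{(n)},\dots,X_r^{(n)})=1\big)=\frac{1}{\zeta(r)}$.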



To estimate the variance of $C_{m,r}^{(n)}$ we now follow standard manipulations of U-statistics. We need to consider some more covariances. Let us define, for $0\le s\le r$,

$$(5.1)\qquad \gamma_{r,s}^{(n)}=\operatorname{cov}\Big(1_{\gcd(X_1^{(n)},\dots,X_s^{(n)},X_{s+1}^{(n)},\dots,X_r^{(n)})=1},\ 1_{\gcd(X_1^{(n)},\dots,X_s^{(n)},X_{r+1}^{(n)},\dots,X_{2r-s}^{(n)})=1}\Big).$$

Observe that the two indicator functions involved in $\gamma_{r,s}^{(n)}$ have exactly $s$ of the variables $X_j^{(n)}$ in common. Notice that $\gamma_{r,1}^{(n)}=c_r^{(n)}$, see equation (3.13), and that $\gamma_{r,0}^{(n)}=0$, because of the independence of the $X_j^{(n)}$'s. Observe that, from the Cauchy-Schwarz inequality,

$$(5.2)\qquad \gamma_{r,s}^{(n)}\ \le\ \gamma_{r,r}^{(n)}=V\big(1_{\gcd(X_1^{(n)},X_2^{(n)},\dots,X_r^{(n)})=1}\big)$$

for each $0\le s\le r$. In fact (see, for instance, [31], p. 182), $\gamma_{r,s}^{(n)}$ increases with $s$, and, in particular, $\gamma_{r,s}^{(n)}\ge 0$, for $0\le s\le r$.

In terms of these covariances, the variance of $C_{m,r}^{(n)}$ may be written as

$$(5.3)\qquad V\big(C_{m,r}^{(n)}\big)=\sum_{s=0}^{r}\binom{m}{s}\binom{m-s}{r-s}\binom{m-r}{r-s}\,\gamma_{r,s}^{(n)} .$$

The product of binomial coefficients in the summand of index $s$ of this expression counts the number of (ordered) pairs of subsets of size $r$ with intersection of size $s$ drawn from $\{1,2,\dots,m\}$. Observe that, with $s,r$ fixed and as $m\to\infty$,

$$\binom{m}{s}\binom{m-s}{r-s}\binom{m-r}{r-s}\sim\frac{1}{s!\,\big((r-s)!\big)^2}\,m^{2r-s} .$$
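The combinatorial count behind the variance formula (5.3) is easy to check by brute force. The Python sketch below compares the product $\binom{m}{s}\binom{m-s}{r-s}\binom{m-r}{r-s}$ with a direct enumeration of ordered pairs of $r$-subsets of an $m$-element set whose intersection has exactly $s$ elements. The function names are illustrative assumptions, and the code assumes $m\ge r$.

```python
from itertools import combinations
from math import comb


def count_by_formula(m: int, r: int, s: int) -> int:
    """Ordered pairs of r-subsets of an m-set with intersection of size s (needs m >= r)."""
    return comb(m, s) * comb(m - s, r - s) * comb(m - r, r - s)


def count_by_enumeration(m: int, r: int, s: int) -> int:
    """Same count, obtained by listing all ordered pairs of r-subsets."""
    subsets = [frozenset(c) for c in combinations(range(m), r)]
    return sum(1 for a in subsets for b in subsets if len(a & b) == s)


if __name__ == "__main__":
    m, r = 7, 3
    for s in range(r + 1):
        print(s, count_by_formula(m, r, s), count_by_enumeration(m, r, s))
```

Summing the counts over $s=0,\dots,r$ gives $\binom{m}{r}^2$, the total number of ordered pairs of $r$-subsets, which is a convenient sanity check.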

We may trivially bound (just keeping the term $s=1$ in (5.3) and using that $\gamma_{r,s}^{(n)}\ge 0$),

$$(5.4)\qquad V\big(C_{m,r}^{(n)}\big)\ \ge\ m\binom{m-1}{r-1}\binom{m-r}{r-1}\,c_r^{(n)} .$$

Recall, see Lemma 3.6, that $c_r^{(n)}$ converges, as $n\to\infty$, to a positive quantity.
Proof of Theorem 5.1. We shall apply again Janson's Theorem 4.2. Consider the (dependency) graph with $\binom{m}{r}$ vertices labeled with the variables $1_{\gcd(X_{i_1},\dots,X_{i_r})=1}$ for $1\le i_1<i_2<\dots<i_r\le m$, and with an edge joining two vertices if they have at least one index of their labels in common. We record now the appropriate parameters in order to apply Theorem 4.2: the number of vertices $N=\binom{m}{r}$; the bound on the variables, $A=1$, since the variables are just indicators; the maximal degree

$$M=\binom{m}{r}-\binom{m-r}{r}-1\sim\frac{r}{(r-1)!}\,m^{r-1},\quad\text{as } m\to\infty,$$

and

$$\sigma^2\ \ge\ m\binom{m-1}{r-1}\binom{m-r}{r-1}\,c_r^{(n)} .$$

Fix any integer $h\ge 3$. For some constant $C_{r,h}$, we have that

$$\Big(\frac{N}{M}\Big)^{1/h}\frac{MA}{\sigma}\ \le\ C_{r,h}\,\Big(\frac{m^{r}}{m^{r-1}}\Big)^{1/h}\,\frac{m^{r-1}}{m^{r-1/2}}\,\frac{1}{\big(c_r^{(n)}\big)^{1/2}}=C_{r,h}\,\frac{m^{1/h}}{m^{1/2}}\,\frac{1}{\big(c_r^{(n)}\big)^{1/2}},$$

which converges to 0 as $m\to\infty$, whatever the sequence $n_m\ge 2$. □



5.2 Sums of greatest common divisors of r-tuples 

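For the variable

$$Z_{m,r}^{(n)}=\sum_{1\le i_1<\dots<i_r\le m}\gcd\big(X_{i_1}^{(n)},\dots,X_{i_r}^{(n)}\big),$$

which sums the greatest common divisors of all the r-tuples of a random sample of length $m$ drawn from $\{1,\dots,n\}$, we have:

Theorem 5.2. For fixed $r\ge 3$ and for any sequence $n_m$ of integers satisfying $2\le n_m\le m^{\beta}$, for some $\beta<1/2$,

$$\frac{Z_{m,r}^{(n_m)}-E\big(Z_{m,r}^{(n_m)}\big)}{\sqrt{V\big(Z_{m,r}^{(n_m)}\big)}}\ \xrightarrow{\ d\ }\ \mathcal N,\quad\text{as } m\to\infty .$$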

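The proof of Theorem 5.2 is a variation of the proof of Theorem 5.1. We just discuss a few of the ingredients.

The mean of $Z_{m,r}^{(n)}$ is given by

$$E\big(Z_{m,r}^{(n)}\big)=\binom{m}{r}\,E\big(\gcd(X_1^{(n)},\dots,X_r^{(n)})\big).$$

Let us define, for $0\le s\le r$,

$$(5.5)\qquad \omega_{r,s}^{(n)}=\operatorname{cov}\Big(\gcd\big(X_1^{(n)},\dots,X_s^{(n)},X_{s+1}^{(n)},\dots,X_r^{(n)}\big),\ \gcd\big(X_1^{(n)},\dots,X_s^{(n)},X_{r+1}^{(n)},\dots,X_{2r-s}^{(n)}\big)\Big).$$

Notice that $\omega_{r,1}^{(n)}=d_r^{(n)}$, see (3.14). Again, $\omega_{r,0}^{(n)}=0$, because of the independence of the $X_j^{(n)}$'s. Again, from Cauchy-Schwarz,

$$(5.6)\qquad \omega_{r,s}^{(n)}\ \le\ \omega_{r,r}^{(n)}=V\big(\gcd(X_1^{(n)},X_2^{(n)},\dots,X_r^{(n)})\big)$$

for each $0\le s\le r$. And again, $0=\omega_{r,0}^{(n)}\le\omega_{r,s}^{(n)}\le\omega_{r,r}^{(n)}$ for $0\le s\le r$.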
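The variance of $Z_{m,r}^{(n)}$ may be written as

$$V\big(Z_{m,r}^{(n)}\big)=\sum_{s=0}^{r}\binom{m}{s}\binom{m-s}{r-s}\binom{m-r}{r-s}\,\omega_{r,s}^{(n)},$$

so that we may bound

$$V\big(Z_{m,r}^{(n)}\big)\ \ge\ m\binom{m-1}{r-1}\binom{m-r}{r-1}\,d_r^{(n)} .$$

Recall, see Lemma 3.7, that, for $r\ge 3$,

$$\lim_{n\to\infty}d_r^{(n)}=\sum_{1\le i,j<\infty}\frac{\varphi(i)\,\varphi(j)}{i^{r}j^{r}}\,\big(\gcd(i,j)-1\big),$$

which is a positive and finite (since $r\ge 3$) quantity.

For the proof of Theorem 5.2, we just have to observe that the parameter $A$ to apply in Janson's Theorem is now $A=n$, and this is why we require now the bound $n_m\le m^{\beta}$ with $\beta<\frac12$.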

Remark 5.3 (Extreme statistics of the greatest common divisor of r-tuples). Fix $r\ge 3$. It would be interesting to determine, if there is any at all, the corresponding approximation result for the maximum of gcd for r-tuples.

Let us see why the approach which we have followed for the case of pairs breaks down for $r\ge 3$. Following that approach, one would fix $t>0$, consider the counter

$$T_m=\sum_{1\le i_1<i_2<\dots<i_r\le m}1_{\gcd(X_{i_1}^{(n)},X_{i_2}^{(n)},\dots,X_{i_r}^{(n)})>t\,s_m},$$

where $\{s_m\}_m$ is some appropriate sequence, and expect to obtain convergence in distribution of $T_m$ to a Poisson variable.

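Now $E(T_m)=\binom{m}{r}P\big(\gcd(X_1^{(n)},X_2^{(n)},\dots,X_r^{(n)})>t\,s_m\big)$ should converge to the parameter $\lambda_t$ defining the purported limiting Poisson distribution. The distribution of gcd of r-tuples satisfies, for $1\le j\le n$, that

$$\Big|P\big(\gcd(X_1^{(n)},X_2^{(n)},\dots,X_r^{(n)})=j\big)-\frac{1}{\zeta(r)}\frac{1}{j^{r}}\Big|\ \le\ C_r\,\frac{1}{n\,j^{r-1}},$$

see, for instance, [14], and therefore, for $1\le k\le n$,

$$\Big|P\big(\gcd(X_1^{(n)},X_2^{(n)},\dots,X_r^{(n)})>k\big)-\frac{1}{\zeta(r)}\sum_{j=k+1}^{n}\frac{1}{j^{r}}\Big|\ \le\ C_r\,\frac{1}{n\,k^{r-2}} .$$

With the forced choice of $s_m=\binom{m}{r}^{1/(r-1)}$, and as long as $n_m/m^{r/(r-1)}\to\infty$, we have that

$$\binom{m}{r}P\big(\gcd(X_1^{(n)},X_2^{(n)},\dots,X_r^{(n)})>t\,s_m\big)\ \longrightarrow\ \frac{1}{(r-1)\,\zeta(r)\,t^{r-1}}=\lambda_t .$$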

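The general result of Brown and Silverman (Theorem A of [3]) for Poisson convergence of U-statistics requires that

$$m^{2r-1}\operatorname{cov}\Big(1_{\gcd(X_1^{(n)},X_2^{(n)},\dots,X_{r-1}^{(n)},X_r^{(n)})>t\,s_m},\ 1_{\gcd(X_1^{(n)},X_2^{(n)},\dots,X_{r-1}^{(n)},X_{r+1}^{(n)})>t\,s_m}\Big)\ \longrightarrow\ 0,$$

as $m\to\infty$.

If we simply estimate, as we did in the case of pairs,

$$\operatorname{cov}\Big(1_{\gcd(X_1^{(n)},\dots,X_{r-1}^{(n)},X_r^{(n)})>t\,s_m},\ 1_{\gcd(X_1^{(n)},\dots,X_{r-1}^{(n)},X_{r+1}^{(n)})>t\,s_m}\Big)\ \le\ \frac{1}{t^2s_m^2}\,E\Big(\gcd\big(X_1^{(n)},\dots,X_{r-1}^{(n)},X_r^{(n)}\big)\cdot\gcd\big(X_1^{(n)},\dots,X_{r-1}^{(n)},X_{r+1}^{(n)}\big)\Big),$$

we get nowhere, because the expectation above is obviously at least 1, and $m^{2r-1}/(t^2s_m^2)$ is of order

$$m^{\,2r-1-\frac{2r}{r-1}},$$

which for $r=2$ tends to 0 with $m$, but for $r\ge 3$, our present case, tends to $\infty$ with $m$.

To obtain an asymptotic approximation result for the maximum of gcd for r-tuples following the approach which we have followed, one would need a better estimate, if possible, of

$$\operatorname{cov}\Big(1_{\gcd(X_1^{(n)},\dots,X_{r-1}^{(n)},X_r^{(n)})>t\,s_m},\ 1_{\gcd(X_1^{(n)},\dots,X_{r-1}^{(n)},X_{r+1}^{(n)})>t\,s_m}\Big).$$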
6 Higher moments 

Finally, we consider in this section U-statistics of moments, other than the first, of gcd. We follow, of course, the general approach of previous sections, particularly Section 5.2; we will just mention the few extra ingredients needed to obtain the corresponding results for higher moments.

We fix throughout this section the integer exponent $q\ge 1$ and the length $r$ for the evaluation of gcd's, and consider

$$Z_m^{(n)}=\sum_{1\le i_1<i_2<\dots<i_r\le m}\gcd\big(X_{i_1}^{(n)},X_{i_2}^{(n)},\dots,X_{i_r}^{(n)}\big)^{q} .$$

Observe that, departing from previous usage, we are not decorating $Z_m^{(n)}$ with the length $r$ (or the exponent $q$).

For $n$ fixed, $n\ge 2$, and as $m\to\infty$, we have asymptotic normality for $Z_m^{(n)}$. This follows exactly as in Section 5.2.

Theorem 6.1. Given a length $r\ge 2$ and an exponent $q\ge 1$, then for fixed $n\ge 2$,

$$Z_m^{(n)}=\sum_{1\le i_1<\dots<i_r\le m}\gcd\big(X_{i_1}^{(n)},\dots,X_{i_r}^{(n)}\big)^{q}\quad\text{is asymptotically normal as } m\to\infty .$$

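For varying $n=n_m$, the general approach hinges on estimating (from below) the covariance

$$\pi^{(n)}=\operatorname{cov}\Big(\gcd\big(X_1^{(n)},X_2^{(n)},\dots,X_r^{(n)}\big)^{q},\ \gcd\big(X_1^{(n)},X_{r+1}^{(n)},\dots,X_{2r-1}^{(n)}\big)^{q}\Big)$$

(just one variable $X_1^{(n)}$ in common). Now, as $n\to\infty$,

$$E\Big(\gcd\big(X_1^{(n)},\dots,X_r^{(n)}\big)^{q}\Big)\sim\begin{cases}\dfrac{\zeta(r-q)}{\zeta(r)}, & \text{if } q\le r-2,\\[2mm] \dfrac{\ln(n)}{\zeta(r)}, & \text{if } q=r-1,\\[2mm] D_{r,q}\,n^{1-r+q}, & \text{if } q\ge r,\end{cases}$$

for some constant $D_{r,q}$ (see [14]).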
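The first regime of the asymptotics of $E(\gcd(X_1,\dots,X_r)^q)$ can be checked numerically. The Python sketch below relies on the identity $k^q=\sum_{d\mid k}J_q(d)$, which defines the Jordan-type function $J_q$, so that $E(\gcd^q)=\sum_{d\le n}J_q(d)\,(\lfloor n/d\rfloor/n)^r$. The function names are illustrative assumptions, and identifying $J_q$ with the arithmetic function used in the paper's Section 3 is also an assumption.

```python
from functools import reduce
from itertools import product
from math import gcd, pi


def jordan_values(limit: int, q: int):
    """J[d] with sum of J[e] over divisors e of d equal to d**q (J[0] is unused)."""
    jordan = [k**q for k in range(limit + 1)]
    for e in range(1, limit + 1):
        # jordan[e] is final here: all proper divisors of e were processed earlier.
        for multiple in range(2 * e, limit + 1, e):
            jordan[multiple] -= jordan[e]
    return jordan


def mean_gcd_power(n: int, r: int, q: int) -> float:
    """E(gcd(X_1, ..., X_r)**q) for X's iid uniform on {1..n}."""
    jordan = jordan_values(n, q)
    return sum(jordan[d] * (n // d) ** r for d in range(1, n + 1)) / n**r


def mean_gcd_power_brute(n: int, r: int, q: int) -> float:
    """Same expectation by enumeration of all n**r tuples (small n only)."""
    total = sum(reduce(gcd, t) ** q for t in product(range(1, n + 1), repeat=r))
    return total / n**r


if __name__ == "__main__":
    zeta_3 = 1.2020569031595942  # Apery's constant, typed in numerically
    print(mean_gcd_power(2000, r=3, q=1), (pi**2 / 6) / zeta_3)
```

With $r=3$ and $q=1$ the output is close to $\zeta(2)/\zeta(3)$; taking $q=r-1$ instead shows the slow logarithmic growth of the second regime.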
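Conditioning on the value of $X_1^{(n)}$ and using Cesàro's marginal formula, Lemma 3.3, we may write

$$\pi^{(n)}=\frac{1}{n}\sum_{k=1}^{n}\Big(\sum_{j\mid k}\varphi_q(j)\Big(\frac{1}{n}\Big\lfloor\frac{n}{j}\Big\rfloor\Big)^{r-1}\Big)^{2}-E\Big(\gcd\big(X_1^{(n)},\dots,X_r^{(n)}\big)^{q}\Big)^{2} .$$

We split the analysis of $\pi^{(n)}$ into the three cases above.

a) For $q\le r-2$, we first write

$$\frac{1}{n}\sum_{k=1}^{n}\Big(\sum_{j\mid k}\varphi_q(j)\Big(\frac{1}{n}\Big\lfloor\frac{n}{j}\Big\rfloor\Big)^{r-1}\Big)^{2}=\sum_{i,j\le n}\varphi_q(i)\,\varphi_q(j)\,\frac{\lfloor n/i\rfloor^{r-1}\lfloor n/j\rfloor^{r-1}}{n^{2(r-1)}}\ \frac{1}{n}\Big\lfloor\frac{n}{\operatorname{lcm}(i,j)}\Big\rfloor,$$

and then bound each summand by $\gcd(i,j)/(ij)^{r-q}$. Since

$$\sum_{i,j\le n}\frac{\gcd(i,j)}{(ij)^{r-q}}\ \le\ \sum_{i,j\ge 1}\frac{\gcd(i,j)}{(ij)^{r-q}}=\frac{\zeta(r-q)^2\,\zeta(2(r-q)-1)}{\zeta(2(r-q))}$$

(see [14]), we may conclude from dominated convergence that

$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n}\Big(\sum_{j\mid k}\varphi_q(j)\Big(\frac{1}{n}\Big\lfloor\frac{n}{j}\Big\rfloor\Big)^{r-1}\Big)^{2}=\sum_{i,j\ge 1}\frac{\varphi_q(i)\,\varphi_q(j)}{i^{r}j^{r}}\,\gcd(i,j).$$

Also,

$$\lim_{n\to\infty}E\Big(\gcd\big(X_1^{(n)},X_2^{(n)},\dots,X_r^{(n)}\big)^{q}\Big)=\frac{\zeta(r-q)}{\zeta(r)}=\sum_{i\ge 1}\frac{\varphi_q(i)}{i^{r}},$$

so that, finally, we have, in this case, $q\le r-2$, that

$$\lim_{n\to\infty}\pi^{(n)}=\sum_{i,j\ge 1}\frac{\varphi_q(i)\,\varphi_q(j)}{i^{r}j^{r}}\,\big(\gcd(i,j)-1\big).$$

This means that we have asymptotic normality as long as $n_m^{q}\le m^{\beta}$ for some $\beta<\frac12$.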

b) Case $q=r-1$. We shall get that $\pi^{(n)}$ is at least of order $\ln(n)^3$. To see this, use (twice) that $\lfloor x\rfloor\ge\frac12x$, if $\lfloor x\rfloor\ge 1$, to bound from below the sum over $k$ appearing in the expression of $\pi^{(n)}$ by a constant multiple of

$$\sum_{\operatorname{lcm}(i,j)\le n}\frac{\varphi_q(i)\,\varphi_q(j)}{i^{q+1}j^{q+1}}\,\gcd(i,j).$$

Appealing now to Corollary 2.5, we deduce that this sum is at least of order $\ln(n)^3$, and further that

$$\liminf_{n\to\infty}\frac{\pi^{(n)}}{\ln(n)^3}>0,$$

since $E\big(\gcd(X_1^{(n)},\dots,X_r^{(n)})^{q}\big)^2$ is only of order $\ln(n)^2$.

The outcome of all this is again that we have asymptotic normality for $Z_m^{(n)}$ as long as $n_m^{q}\le m^{\beta}$ for some $\beta<\frac12$.

We stop and record the consequence of the analysis in these two cases a) and b). 

Theorem 6.2. Given a length $r\ge 2$ and an exponent $q\ge 1$ with $q\le r-1$, then for any sequence $n_m$ satisfying $2\le n_m$ and $n_m^{q}\le m^{\beta}$, with $\beta<\frac12$,

$$Z_m^{(n_m)}=\sum_{1\le i_1<\dots<i_r\le m}\gcd\big(X_{i_1}^{(n_m)},\dots,X_{i_r}^{(n_m)}\big)^{q}\quad\text{is asymptotically normal as } m\to\infty .$$

c) Case $q\ge r$. One would expect that both $\pi^{(n)}$ and $\omega^{(n)}$ would grow in this case as $n^{2(q-r+1)}$. But we have not been able to ascertain that. Nonetheless, if that were the case, then one would have asymptotic normality as long as $2\le n_m$ and $n_m^{r-1}\le m^{\beta}$ with $\beta<\frac12$.

7 Strong law 

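The sequence of counters $C_{m,r}^{(n)}$ indexed by $m$, with sample space $\{1,\dots,n\}$, $n\ge 2$ fixed, and length $r\ge 2$ fixed, satisfies a strong law of large numbers as $m\to\infty$.

Theorem 7.1.

$$\lim_{m\to\infty}\frac{C_{m,r}^{(n)}}{E\big(C_{m,r}^{(n)}\big)}=1\quad\text{almost surely}.$$

In other terms, for almost all realizations $x_1^{(n)},x_2^{(n)},\dots$ of the complete sequence $X_1^{(n)},X_2^{(n)},\dots$, the sequence

$$\Big\{\frac{C_{m,r}^{(n)}\big(x_1^{(n)},x_2^{(n)},\dots,x_m^{(n)}\big)}{E\big(C_{m,r}^{(n)}\big)}\Big\}_{m},$$

where each successive term is calculated using the values of the given realization $\{x_i^{(n)}\}_{i}$, converges to 1.

Since $E\big(C_{m,r}^{(n)}\big)=\binom{m}{r}P\big(\gcd(X_1^{(n)},\dots,X_r^{(n)})=1\big)$, we could also write

$$\lim_{m\to\infty}\frac{C_{m,r}^{(n)}}{\binom{m}{r}}=P\big(\gcd(X_1^{(n)},\dots,X_r^{(n)})=1\big),\quad\text{almost surely}.$$

Recall that, as $n\to\infty$, this probability converges to $1/\zeta(r)$.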
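The strong law of Theorem 7.1 concerns a single realization of the sequence, so it can be illustrated by following the proportion of coprime $r$-subsets along growing prefixes of one simulated sample. The Python sketch below does this for a small sample space, where the limiting probability can be computed exactly by enumeration. The function names, the seed and the parameter values are illustrative assumptions.

```python
import random
from functools import reduce
from itertools import combinations, product
from math import comb, gcd


def coprime_fraction(sample, r):
    """Proportion of r-subsets of the sample whose gcd equals 1."""
    hits = sum(1 for t in combinations(sample, r) if reduce(gcd, t) == 1)
    return hits / comb(len(sample), r)


def coprime_probability(n, r):
    """P(gcd(X_1, ..., X_r) = 1) for X's iid uniform on {1..n}, by enumeration."""
    hits = sum(1 for t in product(range(1, n + 1), repeat=r) if reduce(gcd, t) == 1)
    return hits / n**r


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed, only for reproducibility
    n, r = 10, 2
    realization = [rng.randint(1, n) for _ in range(800)]
    for m in (50, 100, 200, 400, 800):
        print(m, coprime_fraction(realization[:m], r), coprime_probability(n, r))
```

Using prefixes of the same realization, rather than fresh samples for each $m$, is what matches the almost sure statement.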

There are general strong laws for U-statistics which could be applied, but we prefer, given our previous estimates of variances and covariances of gcd's, to derive Theorem 7.1 directly from the following (standard) lemma:

Lemma 7.2. Let $\{Y_m\}_m$ be an increasing sequence of positive random variables in a probability space, such that

1) $E(Y_m)$ increases to infinity at a polynomial rate,

2) $V(Y_m)\le C\,E(Y_m)^{\delta}$ for some $0<\delta<2$.

Then

$$\lim_{m\to\infty}\frac{Y_m}{E(Y_m)}=1\quad\text{almost surely}.$$

By "increasing at a polynomial rate" we mean that $E(Y_m)\sim C\,m^{\beta}$, for some $\beta>0$, as $m\to\infty$.

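Proof of Theorem 7.1. Fix $n\ge 2$ and $r\ge 2$. Let us verify that $Y_m=C_{m,r}^{(n)}$ satisfies the hypotheses of Lemma 7.2. Obviously $0\le Y_m\le Y_{m+1}$. Besides, $E(Y_m)=E\big(C_{m,r}^{(n)}\big)=\binom{m}{r}P\big(\gcd(X_1^{(n)},\dots,X_r^{(n)})=1\big)$ grows at polynomial rate, with $\beta=r$.

Recall, from (5.3), that

$$V\big(C_{m,r}^{(n)}\big)=\sum_{s=0}^{r}\binom{m}{s}\binom{m-s}{r-s}\binom{m-r}{r-s}\,\gamma_{r,s}^{(n)} .$$

Now, since $\gamma_{r,s}^{(n)}\le\gamma_{r,r}^{(n)}$, for $s$ from $s=0$ to $s=r$, and taking into account that $\gamma_{r,0}^{(n)}=0$, we may bound

$$V\big(C_{m,r}^{(n)}\big)\ \le\ \gamma_{r,r}^{(n)}\Big[\binom{m}{r}\binom{m}{r}-\binom{m}{r}\binom{m-r}{r}\Big]\ \le\ C_r\,\gamma_{r,r}^{(n)}\,m^{2r-1} .$$

Since $E\big(C_{m,r}^{(n)}\big)=\binom{m}{r}P\big(\gcd(X_1^{(n)},\dots,X_r^{(n)})=1\big)$, the second condition of Lemma 7.2 is satisfied, with $\delta=2-\frac1r$. □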

For $Z_{m,r}^{(n)}$, and even further for its $q$ moment version, there are analogous strong laws.
For completeness, a proof of Lemma 7.2 (modeled upon [13], Theorem 6.8) follows.

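Proof of Lemma 7.2. Chebyshev's inequality gives, for each $\lambda>0$,

$$P\Big(\Big|\frac{Y_m}{E(Y_m)}-1\Big|>\lambda\Big)\ \le\ \frac{V(Y_m)}{\lambda^2\,E(Y_m)^2}\ \le\ \frac{C}{\lambda^2\,E(Y_m)^{2-\delta}} .$$

This ensures, by the Borel-Cantelli lemma, that the subsequence

$$\frac{Y_{m_k}}{E(Y_{m_k})}\ \longrightarrow\ 1\quad\text{almost surely, as } k\to\infty,$$

if $m_k=\lfloor k^{2/((2-\delta)\beta)}\rfloor$. Now, for each $m$ such that $m_k\le m<m_{k+1}$,

$$\frac{Y_m}{E(Y_m)}\ \le\ \frac{Y_{m_{k+1}}}{E(Y_{m_k})}=\frac{Y_{m_{k+1}}}{E(Y_{m_{k+1}})}\cdot\frac{E(Y_{m_{k+1}})}{E(Y_{m_k})} .$$

Since $m_{k+1}/m_k\to 1$ as $k\to\infty$, and because of the polynomial rate condition, we deduce that, almost surely,

$$\limsup_{m\to\infty}\frac{Y_m}{E(Y_m)}\le 1 .$$

An analogous estimate from below completes the proof. □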